NOVEL PROTEINS AND NUCLEIC ACIDS ENCODING SAME

FIELD OF THE INVENTION 

The present invention relates to novel polypeptides that are targets of small 
molecule drugs and that have properties related to stimulation of biochemical or 
physiological responses in a cell, a tissue, an organ or an organism. More particularly, the 
novel polypeptides are gene products of novel genes, or are specified biologically active 
fragments or derivatives thereof. Methods of use encompass diagnostic and prognostic 
assay procedures as well as methods of treating diverse pathological conditions.

BACKGROUND OF THE INVENTION

Eukaryotic cells are characterized by biochemical and physiological processes
which under normal conditions are exquisitely balanced to achieve the preservation and
propagation of the cells. When such cells are components of multicellular organisms such
as vertebrates, or more particularly organisms such as mammals, the regulation of the
biochemical and physiological processes involves intricate signaling pathways. Frequently,
such signaling pathways involve extracellular signaling proteins, cellular receptors that
bind the signaling proteins and signal transducing components located within the cells.

Signaling proteins may be classified as endocrine effectors, paracrine effectors or
autocrine effectors. Endocrine effectors are signaling molecules secreted by a given organ
into the circulatory system, which are then transported to a distant target organ or tissue.
The target cells include the receptors for the endocrine effector, and when the endocrine
effector binds, a signaling cascade is induced. Paracrine effectors involve secreting cells
and receptor cells in close proximity to each other, for example two different classes of
cells in the same tissue or organ. One class of cells secretes the paracrine effector, which
then reaches the second class of cells, for example by diffusion through the extracellular
fluid. The second class of cells contains the receptors for the paracrine effector; binding of
the effector results in induction of the signaling cascade that elicits the corresponding
biochemical or physiological effect. Autocrine effectors are highly analogous to paracrine
effectors, except that the same cell type that secretes the autocrine effector also contains the
receptor. Thus the autocrine effector binds to receptors on the same cell, or on identical
neighboring cells. The binding process then elicits the characteristic biochemical or
physiological effect.

Signaling processes may elicit a variety of effects on cells and tissues including, by
way of nonlimiting example, induction of cell or tissue proliferation, suppression of growth
or proliferation, induction of differentiation or maturation of a cell or tissue, and
suppression of differentiation or maturation of a cell or tissue.

Many pathological conditions involve dysregulation of expression of important
effector proteins. In certain classes of pathologies the dysregulation is manifested as a
diminished or suppressed level of synthesis and secretion of protein effectors. In other
classes of pathologies the dysregulation is manifested as an increased or up-regulated level of
synthesis and secretion of protein effectors. In a clinical setting a subject may be suspected
of suffering from a condition brought on by altered or mis-regulated levels of a protein
effector of interest. Therefore there is a need to assay for the level of the protein effector
of interest in a biological sample from such a subject, and to compare the level with that
characteristic of a nonpathological condition. There also is a need to provide the protein
effector as a product of manufacture. Administration of the effector to a subject in need
thereof is useful in treatment of the pathological condition. Accordingly, there is a need for
a method of treatment of a pathological condition brought on by diminished or suppressed
levels of the protein effector of interest. In addition, there is a need for a method of
treatment of a pathological condition brought on by increased or up-regulated levels of
the protein effector of interest.

Small molecule targets have been implicated in various disease states or
pathologies. These targets may be proteins, and particularly enzymatic proteins, which are
acted upon by small molecule drugs for the purpose of altering target function and
achieving a desired result. Cellular, animal and clinical studies can be performed to
elucidate the genetic contribution to the etiology and pathogenesis of conditions in which
small molecule targets are implicated in a variety of physiologic, pharmacologic or native
states. These studies utilize the core technologies at CuraGen Corporation to look at
differential gene expression, protein-protein interactions, large-scale sequencing of
expressed genes and the association of genetic variations such as, but not limited to, single
nucleotide polymorphisms (SNPs) or splice variants in and between biological samples
from experimental and control groups. The goal of such studies is to identify potential
avenues for therapeutic intervention in order to prevent, treat the consequences of, or cure
the conditions.

In order to treat diseases, pathologies and other abnormal states or conditions in
which a mammalian organism has been diagnosed as being, or as being at risk for
becoming, other than in a normal state or condition, it is important to identify new
therapeutic agents. Such a procedure includes at least the steps of identifying a target
component within an affected tissue or organ, and identifying a candidate therapeutic agent
that modulates the functional attributes of the target. The target component may be any
biological macromolecule implicated in the disease or pathology. Commonly the target is a
polypeptide or protein with specific functional attributes. Other classes of target
macromolecule include a nucleic acid, a polysaccharide, and a lipid such as a complex lipid
or a glycolipid; in addition a target may be a sub-cellular structure or extra-cellular
structure that is comprised of more than one of these classes of macromolecule. Once such a
target has been identified, it may be employed in a screening assay in order to identify
favorable candidate therapeutic agents from among a large population of substances or
compounds.

In many cases the objective of such screening assays is to identify small molecule
candidates; this is commonly approached by the use of combinatorial methodologies to
develop the population of substances to be tested. The implementation of high throughput
screening methodologies is advantageous when working with large, combinatorial libraries
of compounds.

SUMMARY OF THE INVENTION 

The invention includes nucleic acid sequences and the novel polypeptides they
encode. The novel nucleic acids and polypeptides are referred to herein as NOVX, or
NOV1, NOV2, NOV3, etc., nucleic acids and polypeptides. These nucleic acids and
polypeptides, as well as derivatives, homologs, analogs and fragments thereof, will
hereinafter be collectively designated as "NOVX" nucleic acids, which represent the
nucleotide sequences selected from the group consisting of SEQ ID NO: 2n-1, wherein n is
an integer between 1 and 124, or "NOVX" polypeptides, which represent the polypeptide
sequences selected from the group consisting of SEQ ID NO: 2n, wherein n is an integer
between 1 and 124.
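
The numbering convention stated above (nucleic acid at SEQ ID NO: 2n-1, polypeptide at SEQ ID NO: 2n) can be illustrated with a short sketch. The function name is hypothetical, and the range of n is assumed here to include both endpoints, 1 and 124.

```python
def seq_id_numbers(n: int) -> tuple[int, int]:
    """Return (nucleic acid SEQ ID NO, polypeptide SEQ ID NO) for index n.

    Assumption: "between 1 and 124" is read as inclusive of both endpoints.
    """
    if not 1 <= n <= 124:
        raise ValueError("n must be an integer between 1 and 124")
    nucleic_acid_id = 2 * n - 1   # odd-numbered SEQ ID NO
    polypeptide_id = 2 * n        # the even-numbered SEQ ID NO that follows it
    return nucleic_acid_id, polypeptide_id


if __name__ == "__main__":
    for n in (1, 2, 124):
        print(n, seq_id_numbers(n))
```

Under this reading each nucleic acid sequence carries an odd SEQ ID NO and the polypeptide it encodes carries the next even number, which matches the pairing listed in Table A.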

In one aspect, the invention provides an isolated polypeptide comprising a mature
form of a NOVX amino acid sequence. One example is a variant of a mature form of a NOVX
amino acid sequence, wherein any amino acid in the mature form is changed to a different
amino acid, provided that no more than 15% of the amino acid residues in the sequence of
the mature form are so changed. The amino acid sequence can be, for example, a NOVX amino
acid sequence or a variant of a NOVX amino acid sequence, wherein any amino acid specified
in the chosen sequence is changed to a different amino acid, provided that no more than
15% of the amino acid residues in the sequence are so changed. The invention also
includes fragments of any of these. In another aspect, the invention also includes an
isolated nucleic acid that encodes a NOVX polypeptide, or a fragment, homolog, analog or
derivative thereof.
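
The 15% limit on changed residues described above can be checked mechanically. The following is a minimal sketch only: it assumes the variant differs from the reference by substitutions alone (equal lengths, no insertions or deletions), and the function names and the 20-residue example sequence are made up for illustration and do not come from the sequence listing.

```python
def count_substitutions(reference: str, variant: str) -> int:
    """Count positions at which the variant differs from the reference."""
    if len(reference) != len(variant):
        # Assumption of this sketch: substitutions only, so lengths must match.
        raise ValueError("sequences must have equal length")
    return sum(1 for ref, var in zip(reference, variant) if ref != var)


def within_variant_limit(reference: str, variant: str, max_percent: int = 15) -> bool:
    """True if no more than max_percent of the residues are changed."""
    if not reference:
        raise ValueError("reference sequence must not be empty")
    changed = count_substitutions(reference, variant)
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return changed * 100 <= max_percent * len(reference)


if __name__ == "__main__":
    mature = "MKTAYIAKQRQISFVKSHFS"          # hypothetical 20-residue mature form
    three_changes = "AKTAYIAKQAQISFVKSHFA"   # 3 of 20 residues changed (15%)
    four_changes = "AATAYIAKQAQISFVKSHFA"    # 4 of 20 residues changed (20%)
    print(within_variant_limit(mature, three_changes))
    print(within_variant_limit(mature, four_changes))
```

The same check applies to the 10% and 15% nucleotide limits recited elsewhere in this summary by passing a nucleotide string and the corresponding max_percent value.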

Also included in the invention is a NOVX polypeptide that is a naturally occurring
allelic variant of a NOVX sequence. In one embodiment, the allelic variant includes an
amino acid sequence that is the translation of a nucleic acid sequence differing by a single
nucleotide from a NOVX nucleic acid sequence. In another embodiment, the NOVX
polypeptide is a variant polypeptide described therein, wherein any amino acid specified in
the chosen sequence is changed to provide a conservative substitution. In one embodiment,
the invention discloses a method for determining the presence or amount of the NOVX
polypeptide in a sample. The method involves the steps of: providing a sample;
introducing the sample to an antibody that binds immunospecifically to the polypeptide;
and determining the presence or amount of antibody bound to the NOVX polypeptide,
thereby determining the presence or amount of the NOVX polypeptide in the sample. In
another embodiment, the invention provides a method for determining the presence of or
predisposition to a disease associated with altered levels of a NOVX polypeptide in a first
mammalian subject. This method involves the steps of: measuring the level of expression
of the polypeptide in a sample from the first mammalian subject; and comparing the
amount of the polypeptide in the sample of the first step to the amount of the polypeptide
present in a control sample from a second mammalian subject known not to have, or not to
be predisposed to, the disease, wherein an alteration in the expression level of the
polypeptide in the first subject as compared to the control sample indicates the presence of
or predisposition to the disease.
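
The comparison step of the diagnostic method above can be sketched as a ratio test between the subject sample and the control sample. The text does not specify how large a difference counts as an "alteration", so the two-fold threshold below is purely a hypothetical parameter, as are the function name and the example values.

```python
def expression_altered(subject_level: float, control_level: float,
                       fold_threshold: float = 2.0) -> bool:
    """Flag an altered expression level relative to a control sample.

    fold_threshold is a hypothetical cut-off; the specification only
    requires "an alteration" and does not state a numerical value.
    """
    if subject_level < 0 or control_level <= 0:
        raise ValueError("levels must be non-negative and the control must be positive")
    if fold_threshold <= 1:
        raise ValueError("fold_threshold must be greater than 1")
    ratio = subject_level / control_level
    # Both up-regulation and down-regulation count as an alteration.
    return ratio >= fold_threshold or ratio <= 1 / fold_threshold


if __name__ == "__main__":
    print(expression_altered(10.0, 4.0))   # 2.5-fold higher than control
    print(expression_altered(5.0, 4.0))    # 1.25-fold, within the threshold
    print(expression_altered(1.0, 4.0))    # 4-fold lower than control
```

In practice the threshold would be set from the assay's measured variability in unaffected subjects; the sketch only shows the shape of the comparison.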

In a further embodiment, the invention includes a method of identifying an agent
that binds to a NOVX polypeptide. This method involves the steps of: introducing the
polypeptide to the agent; and determining whether the agent binds to the polypeptide. In
various embodiments, the agent is a cellular receptor or a downstream effector.

In another aspect, the invention provides a method for identifying a potential
therapeutic agent for use in treatment of a pathology, wherein the pathology is related to
aberrant expression or aberrant physiological interactions of a NOVX polypeptide. The
method involves the steps of: providing a cell expressing the NOVX polypeptide and
having a property or function ascribable to the polypeptide; contacting the cell with a
composition comprising a candidate substance; and determining whether the substance
alters the property or function ascribable to the polypeptide; whereby, if an alteration
observed in the presence of the substance is not observed when the cell is contacted with a
composition devoid of the substance, the substance is identified as a potential therapeutic
agent. In another aspect, the invention describes a method for screening for a modulator of
activity or of latency or predisposition to a pathology associated with the NOVX
polypeptide. This method involves the following steps: administering a test compound to a
test animal at increased risk for a pathology associated with the NOVX polypeptide,
wherein the test animal recombinantly expresses the NOVX polypeptide; measuring the
activity of the NOVX polypeptide in the test animal after administering the test compound;
and comparing the activity of the protein in the test animal with the activity of the NOVX
polypeptide in a control animal not administered
the polypeptide, wherein a change in the activity of the NOVX polypeptide in the test
animal relative to the control animal indicates the test compound is a modulator of latency
of, or predisposition to, a pathology associated with the NOVX polypeptide. In one
embodiment, the test animal is a recombinant test animal that expresses a test protein
transgene or expresses the transgene under the control of a promoter at an increased level
relative to a wild-type test animal, and wherein the promoter is not the native gene
promoter of the transgene. In another aspect, the invention includes a method for
modulating the activity of the NOVX polypeptide, the method comprising contacting a
cell sample expressing the NOVX polypeptide with a compound that binds to the
polypeptide in an amount sufficient to modulate the activity of the polypeptide.

The invention also includes an isolated nucleic acid that encodes a NOVX
polypeptide, or a fragment, homolog, analog or derivative thereof. In a preferred
embodiment, the nucleic acid molecule comprises the nucleotide sequence of a naturally
occurring allelic nucleic acid variant. In another embodiment, the nucleic acid encodes a
variant polypeptide, wherein the variant polypeptide has the polypeptide sequence of a
naturally occurring polypeptide variant. In another embodiment, the nucleic acid molecule
differs by a single nucleotide from a NOVX nucleic acid sequence. In one embodiment,
the NOVX nucleic acid molecule hybridizes under stringent conditions to the nucleotide
sequence selected from the group consisting of SEQ ID NO: 2n-1, wherein n is an integer
between 1 and 124, or a complement of the nucleotide sequence. In another aspect, the
invention provides a vector or a cell expressing a NOVX nucleotide sequence.

In one embodiment, the invention discloses a method for modulating the activity of
a NOVX polypeptide. The method includes the step of contacting a cell sample
expressing the NOVX polypeptide with a compound that binds to the polypeptide in an
amount sufficient to modulate the activity of the polypeptide. In another embodiment, the
invention includes an isolated NOVX nucleic acid molecule comprising a nucleic acid
sequence encoding a polypeptide comprising a NOVX amino acid sequence or a variant of
a mature form of the NOVX amino acid sequence, wherein any amino acid in the mature
form of the chosen sequence is changed to a different amino acid, provided that no more
than 15% of the amino acid residues in the sequence of the mature form are so changed. In
another embodiment, the invention includes an amino acid sequence that is a variant of the
NOVX amino acid sequence, in which any amino acid specified in the chosen sequence is
changed to a different amino acid, provided that no more than 15% of the amino acid
residues in the sequence are so changed.

In one embodiment, the invention discloses a NOVX nucleic acid fragment
encoding at least a portion of a NOVX polypeptide or any variant of the polypeptide,
wherein any amino acid of the chosen sequence is changed to a different amino acid,
provided that no more than 10% of the amino acid residues in the sequence are so changed.
In another embodiment, the invention includes the complement of any of the NOVX
nucleic acid molecules or a naturally occurring allelic nucleic acid variant. In another
embodiment, the invention discloses a NOVX nucleic acid molecule that encodes a variant
polypeptide, wherein the variant polypeptide has the polypeptide sequence of a naturally
occurring polypeptide variant. In another embodiment, the invention discloses a NOVX
nucleic acid, wherein the nucleic acid molecule differs by a single nucleotide from a
NOVX nucleic acid sequence.
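
Two of the relationships named above, the complement of a nucleic acid and a molecule that differs from it by a single nucleotide, can be expressed compactly. This is a minimal sketch under stated assumptions: sequences are plain DNA strings over A, C, G and T, the complement is reported as the reverse complement (the complementary strand read 5' to 3'), "differs by a single nucleotide" is read as exactly one substitution at equal length, and the function names and example sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Return the complementary strand, read 5' to 3'."""
    sequence = sequence.upper()
    unexpected = set(sequence) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return sequence.translate(_COMPLEMENT)[::-1]


def differs_by_single_nucleotide(first: str, second: str) -> bool:
    """True if the two sequences differ at exactly one position.

    Assumption: a single substitution; insertions and deletions are not
    considered, so sequences of unequal length return False.
    """
    if len(first) != len(second):
        return False
    mismatches = sum(1 for a, b in zip(first.upper(), second.upper()) if a != b)
    return mismatches == 1


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
    print(differs_by_single_nucleotide("ATGCCA", "ATGACA"))
```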

In another aspect, the invention includes a NOVX nucleic acid, wherein one or
more nucleotides in the NOVX nucleotide sequence are changed to a different nucleotide
provided that no more than 15% of the nucleotides are so changed. In one embodiment, the
invention discloses a nucleic acid fragment of the NOVX nucleotide sequence and a
nucleic acid fragment wherein one or more nucleotides in the NOVX nucleotide sequence
are changed from that selected from the group consisting of the chosen sequence to a
different nucleotide provided that no more than 15% of the nucleotides are so changed. In
another embodiment, the invention includes a nucleic acid molecule wherein the nucleic
acid molecule hybridizes under stringent conditions to a NOVX nucleotide sequence or a
complement of the NOVX nucleotide sequence. In one embodiment, the invention
includes a nucleic acid molecule, wherein the sequence is changed such that no more than
15% of the nucleotides in the coding sequence differ from the NOVX nucleotide sequence
or a fragment thereof.

In a further aspect, the invention includes a method for determining the presence or
amount of the NOVX nucleic acid in a sample. The method involves the steps of:
providing the sample; introducing the sample to a probe that binds to the nucleic acid
molecule; and determining the presence or amount of the probe bound to the NOVX
nucleic acid molecule, thereby determining the presence or amount of the NOVX nucleic
acid molecule in the sample. In one embodiment, the presence or amount of the nucleic
acid molecule is used as a marker for cell or tissue type.

In another aspect, the invention discloses a method for determining the presence of
or predisposition to a disease associated with altered levels of the NOVX nucleic acid
molecule in a first mammalian subject. The method involves the steps of: measuring the
amount of NOVX nucleic acid in a sample from the first mammalian subject; and
comparing the amount of the nucleic acid in the sample of the first step to the amount of
NOVX nucleic acid present in a control sample from a second mammalian subject known not to
have, or not to be predisposed to, the disease; wherein an alteration in the level of the nucleic
acid in the first subject as compared to the control sample indicates the presence of or
predisposition to the disease.

Unless otherwise defined, all technical and scientific terms used herein have the
same meaning as commonly understood by one of ordinary skill in the art to which this
invention belongs. Although methods and materials similar or equivalent to those
described herein can be used in the practice or testing of the present invention, suitable
methods and materials are described below. All publications, patent applications, patents,
and other references mentioned herein are incorporated by reference in their entirety. In
the case of conflict, the present specification, including definitions, will control. In
addition, the materials, methods, and examples are illustrative only and not intended to be
limiting.

Other features and advantages of the invention will be apparent from the following 
detailed description and claims. 

DETAILED DESCRIPTION OF THE INVENTION 

The present invention provides novel nucleotides and polypeptides encoded
thereby. Included in the invention are the novel nucleic acid sequences, their encoded
polypeptides, antibodies, and other related compounds. The sequences are collectively
referred to herein as "NOVX nucleic acids" or "NOVX polynucleotides" and the
corresponding encoded polypeptides are referred to as "NOVX polypeptides" or "NOVX
proteins." Unless indicated otherwise, "NOVX" is meant to refer to any of the novel
sequences disclosed herein. Table A provides a summary of the NOVX nucleic acids and
their encoded polypeptides.

TABLE A. Sequences and Corresponding SEQ ID Numbers

NOVX        Internal        SEQ ID NO       SEQ ID NO     Homology
Assignment  Identification  (nucleic acid)  (amino acid)

1a          CG106764-01     1               2             Citron Kinase
1b          268667493       3               4             RHO/RAC-Interacting Citron Kinase
1c          268667539       5               6             RHO/RAC-Interacting Citron Kinase
1d          268667543       7               8             RHO/RAC-Interacting Citron Kinase
1e          268667555       9               10            RHO/RAC-Interacting Citron Kinase
1f          268667574       11              12            RHO/RAC-Interacting Citron Kinase
1g          CG106764-02     13              14            RHO/RAC-Interacting Citron Kinase
2a          CG117662-01     15              16            Renal Renin Precursor
2b          CG117662-02     17              18            Renal Renin Precursor
3a          CG118051-01     19              20            Aldehyde Dehydrogenase
3b          CG118051-02     21              22            Aldehyde Dehydrogenase
3c          CG118051-03     23              24            Aldehyde Dehydrogenase
4a          CG120277-01     25              26            Aldehyde Dehydrogenase-3
4b          CG120277-02     27              28            Aldehyde Dehydrogenase-3
5a          CG140468-01     29              30            Serine/Threonine-Protein Kinase PAK 1
5b          CG140468-02     31              32            Serine/Threonine-Protein Kinase PAK 1
6a          CG142182-01     33              34            Ubiquitin Carboxyl-terminal Hydrolase 15
7a          CG142564-01     35              36            Carnitine O-Palmitoyltransferase I
8a          CG142797-01     37              38            Cathepsin L
9a          CG143216-01     39              40            Laminin Gamma 3 Chain Precursor
10a         CG143787-01     41              42            Disintegrin Protease
10b         278889162       43              44            Disintegrin Protease
10c         278689868       45              46            Disintegrin Protease
11a         CG144112-01     47              48            Neuropsin Precursor-like (Homo sapiens)
11b         CG144112-04     49              50            Neuropsin Precursor
11c         255501898       51              52            Neuropsin Precursor
11d         255612524       53              54            Neuropsin Precursor
11e         255612566       55              56            Neuropsin Precursor
11f         306434072       57              58            Neuropsin Precursor
11g         CG144112-02     59              60            Neuropsin Precursor
11h         CG144112-03     61              62            Neuropsin Precursor
12a         CG144497-01     63              64            Adenylosuccinate Synthetase Muscle Isozyme
13a         CG144686-01     65              66            Mast Cell Carboxypeptidase A Precursor
13b         278690008       67              68            Mast Cell Carboxypeptidase A Precursor
13c         278690035       69              70            Mast Cell Carboxypeptidase A Precursor
13d         CG144686-02     71              72            Mast Cell Carboxypeptidase A Precursor
14a         CG144906-01     73              74            Testisin Precursor
14b         CG144906-02     75              76            Testisin Precursor
15a         CG144997-01     77              78            RNaseHI
15b         278693648       79              80            RNaseHI
15c         278480974       81              82            RNaseHI
15d         278498047       83              84            RNaseHI
15e         CG144997-02     85              86
16a         CG145494-01     87              88            PRESTIN
17a         CG145722-01     89              90            WEE1
18a         CG145754-01     91              92            Kallikrein 7 Precursor
18b         CG145754-03     93              94            Kallikrein 7 Precursor
18c         CG145754-02     95              96            Kallikrein 7 Precursor
18d         252718128       97              98            Kallikrein
18e         252718152       99              100           Kallikrein
18f         247856668       101             102           Kallikrein 7 Precursor
18g         247856705       103             104           Kallikrein 7 Precursor
19a         CG146279-01     105             106           Novel Potassium Channel Subfamily K Member 10 (TREK-2)
20a         CG146374-01     107             108           Glycogen Branching Enzyme
21a         CG146403-01     109             110           Diacylglycerol Acyltransferase 2
22a         CG146513-01     111             112           Diacylglycerol Acyltransferase 2
23a         CG146522-01     113             114           Diacylglycerol Acyltransferase 2
24a         CG146531-01     115             116           Diacylglycerol Acyltransferase 2
25a         CG147274-01     117             118           Protease
26a         CG147351-01     119             120           Testis-Development Related NYD-SP27
27a         CG147419-01     121             122           Glutamine:Fructose-6-Phosphate Amidotransferase 1 Muscle Isoform
28a         CG148102-01     123             124           Carnitine O-Palmitoyltransferase
28b         CG148102-02     125             126           Carnitine O-Palmitoyltransferase
29a         CG148431-01     127             128           Class II Aminotransferase
29b         CG148431-02     129             130           Class II Aminotransferase
30a         CG148888-01     131             132           GALNAC 4-Sulfotransferase
31a         CG149008-01     133             134           Sodium/Hydrogen Exchanger
32a         CG149350-01     135             136           Vacuolar ATP Synthase Subunit F
32b         CG149350-02     137             138           Vacuolar ATP Synthase Subunit F
33a         CG149463-01     139             140           Serine/Threonine-Protein Kinase SGK
34a         CG149536-01     141             142           Protein-Tyrosine Phosphatase, Non-Receptor Type 2
35a         CG149964-01     143             144           Brain Mitochondrial Carrier Protein-1
35b         309326356       145             146           Brain Mitochondrial Carrier Protein-1
35c         309326444       147             148           Brain Mitochondrial Carrier Protein-1
35d         309326473       149             150           Brain Mitochondrial Carrier Protein-1
35e         CG149964-02     151             152           Brain Mitochondrial Carrier Protein-1
36a         CG150306-01     153             154           Dual Specificity Protein Phosphatase 5
37a         CG150510-01     155             156           Human Alpha-2,3-Sialyltransferase
38a         CG150704-01     157             158           Testis ecto-ADP-Ribosyltransferase Precursor
39a         CG150799-01     159             160           MASS1
39b         CG150799-02     161             162           MASS1
39c         CG150799-03     163             164           MASS1
39d         CG150799-01     165             166           MASS1
40a         CG151014-01     167             168           Metabotropic Glutamate Receptor 3
40b         CG151014-02     169             170           Metabotropic Glutamate Receptor 3
40c         CG151014-03     171             172           Metabotropic Glutamate Receptor 3
41a         CG151297-01     173             174           Calmodulin-Dependent Phosphodiesterase
41b         CG151297-02     175             176           Calmodulin-Dependent Phosphodiesterase
42a         CG151822-01     177             178           Methyltransferase
42b         CG151822-02     179             180           Prenylcysteine Carboxyl Methyltransferase
43a         CG152256-01     181             182           Phosphatidylserine Synthase
44a         CG171804-01     183             184           N-Acetylgalactosaminide Alpha 2,6-Sialyltransferase
45a         CG171841-01     185             186           Iron-Containing Alcohol Dehydrogenase
46a         CG173017-01     187             188           Retinoic Acid Receptor RXR-Beta
47a         CG173347-01     189             190           Serum Paraoxonase/Arylesterase 3
48a         CG56234-01      191             192           Phosphoenolpyruvate Carboxykinase 2 (PCK2)
48b         CG56234-02      193             194           Phosphoenolpyruvate Carboxykinase 2 (PCK2)
49a         CG56836-01      195             196           Cathepsin B
49b         CG56836-02      197             198           Cathepsin B
49c         CG56836-03      199             200           Cathepsin B
49d         CG56836-04      201             202           Cathepsin B
49e         247856403       203             204           Cathepsin B
49f         247856434       205             206           Cathepsin B
49g         247856497       207             208           Cathepsin B
49h         247856493       209             210           Cathepsin B
49i         247856574       211             212           Cathepsin B
49j         247856545       213             214           Cathepsin B
49k         275480714       215             216           Cathepsin B
50a         CG57284-01      217             218           RAS-Related Protein RAB-5C
50b         CG57284-03      219             220           RAS-Related Protein RAB-5C
50c         CG57284-02      221             222           RAS-Related Protein RAB-5C
51a         CG57308-01      223             224           Sulfonylurea Receptor 1
51b         CG57308-02      225             226           Sulfonylurea Receptor 1
52a         CG93659-01      227             228           Mitogen-Activated Protein Kinase Kinase Kinase 8
52b         CG93659-03      229             230           Mitogen-Activated Protein Kinase Kinase Kinase 8
52c         CG93659-02      231             232           Mitogen-Activated Protein Kinase Kinase Kinase 8
53a         CG94521-01      233             234           Cytoplasmic Glycerol-3-Phosphate Dehydrogenase [NAD+]
53b         CG94521-03      235             236           Cytoplasmic Glycerol-3-Phosphate Dehydrogenase [NAD+]
53c         CG94521-02      237             238           Cytoplasmic Glycerol-3-Phosphate Dehydrogenase [NAD+]
54a         CG96613-01      239             240           Pyruvate Dehydrogenase Kinase (PDK1)
54b         CG96613-03      241             242           Pyruvate Dehydrogenase Kinase (PDK1)
54c         CG96613-02      243             244           Pyruvate Dehydrogenase Kinase (PDK1)
55a         CG96736-01      245             246           Neutral Amino Acid Transporter B
55b         CG96736-02      247             248           Neutral Amino Acid Transporter B

Table A indicates the homology of NOVX polypeptides to known protein families.
Thus, the nucleic acids and polypeptides, antibodies and related compounds according to
the invention corresponding to a NOVX as identified in column 1 of Table A will be useful
in therapeutic and diagnostic applications implicated in, for example, pathologies and
disorders associated with the known protein families identified in column 5 of Table A.

Pathologies, diseases, disorders, conditions and the like that are associated with
NOVX sequences include, but are not limited to: e.g., cardiomyopathy, atherosclerosis,
hypertension, congenital heart defects, aortic stenosis, atrial septal defect (ASD),
atrioventricular (A-V) canal defect, ductus arteriosus, pulmonary stenosis, subaortic
stenosis, ventricular septal defect (VSD), valve diseases, tuberous sclerosis, scleroderma,
obesity, metabolic disturbances associated with obesity, transplantation,
adrenoleukodystrophy, congenital adrenal hyperplasia, prostate cancer, diabetes, metabolic
disorders, neoplasm, adenocarcinoma, lymphoma, uterus cancer, fertility, hemophilia,
hypercoagulation, idiopathic thrombocytopenic purpura, immunodeficiencies, graft versus
host disease, AIDS, bronchial asthma, Crohn's disease, multiple sclerosis, treatment of
Albright Hereditary Osteodystrophy, infectious disease, anorexia, cancer-associated
cachexia, cancer, neurodegenerative disorders, Alzheimer's Disease, Parkinson's Disorder,
immune disorders, hematopoietic disorders, and the various dyslipidemias, the metabolic
syndrome X and wasting disorders associated with chronic diseases and various cancers, as
well as conditions such as transplantation and fertility.

NOVX nucleic acids and their encoded polypeptides are useful in a variety of
applications and contexts. The various NOVX nucleic acids and polypeptides according to
the invention are useful as novel members of the protein families according to the presence
of domains and sequence relatedness to previously described proteins. Additionally,
NOVX nucleic acids and polypeptides can also be used to identify proteins that are
members of the family to which the NOVX polypeptides belong.

Consistent with other known members of the family of proteins, identified in
column 5 of Table A, the NOVX polypeptides of the present invention show homology to,
and contain domains that are characteristic of, other members of such protein families.
Details of the sequence relatedness and domain analysis for each NOVX are presented in
Example A.
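
The column layout of Table A lends itself to a simple record structure. The sketch below copies four rows from Table A and shows how the NOVX number can be derived from the assignment label (column 1) and how variants can be grouped by NOVX number or looked up by homology (column 5). The class, function and field names are hypothetical conveniences for illustration and are not part of the specification.

```python
import re
from collections import defaultdict
from typing import NamedTuple


class TableARow(NamedTuple):
    assignment: str           # column 1, e.g. "49b"
    internal_id: str          # column 2
    nucleic_acid_seq_id: int  # column 3
    amino_acid_seq_id: int    # column 4
    homology: str             # column 5


# A few rows copied from Table A for illustration.
ROWS = [
    TableARow("1a", "CG106764-01", 1, 2, "Citron Kinase"),
    TableARow("8a", "CG142797-01", 37, 38, "Cathepsin L"),
    TableARow("49a", "CG56836-01", 195, 196, "Cathepsin B"),
    TableARow("49b", "CG56836-02", 197, 198, "Cathepsin B"),
]


def novx_number(assignment: str) -> int:
    """Extract the NOVX number from a label such as '49b' (digits, then a letter)."""
    match = re.fullmatch(r"(\d+)([a-z])", assignment)
    if match is None:
        raise ValueError(f"unrecognized assignment label: {assignment!r}")
    return int(match.group(1))


def ids_are_paired(row: TableARow) -> bool:
    """Check the 2n-1 / 2n pairing of nucleic acid and amino acid SEQ ID NOs."""
    return (row.nucleic_acid_seq_id % 2 == 1
            and row.amino_acid_seq_id == row.nucleic_acid_seq_id + 1)


def group_by_novx(rows: list[TableARow]) -> dict[int, list[TableARow]]:
    """Group variant rows (a, b, c, ...) under their shared NOVX number."""
    groups: dict[int, list[TableARow]] = defaultdict(list)
    for row in rows:
        groups[novx_number(row.assignment)].append(row)
    return dict(groups)


if __name__ == "__main__":
    for number, variants in group_by_novx(ROWS).items():
        labels = ", ".join(v.assignment for v in variants)
        print(f"NOV{number}: {labels} ({variants[0].homology})")
```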

The NOVX nucleic acids and polypeptides can also be used to screen for molecules
that inhibit or enhance NOVX activity or function. Specifically, the nucleic acids and
polypeptides according to the invention may be used as targets for the identification of
small molecules that modulate or inhibit diseases associated with the protein families listed
in Table A.

The NOVX nucleic acids and polypeptides are also useful for detecting specific cell
types. Details of the expression analysis for each NOVX are presented in Example C.
Accordingly, the NOVX nucleic acids, polypeptides, antibodies and related compounds
according to the invention will have diagnostic and therapeutic applications in the detection
of a variety of diseases with differential expression in normal vs. diseased tissues, e.g.
detection of a variety of cancers.

Additional utilities for NOVX nucleic acids and polypeptides according to the 
invention are disclosed herein. 

NOVX clones 

NOVX nucleic acids and their encoded polypeptides are useful in a variety of 
applications and contexts. The various NOVX nucleic acids and polypeptides according to 
the invention are useful as novel members of the protein families according to the presence 
of domains and sequence relatedness to previously described proteins. Additionally, 
NOVX nucleic acids and polypeptides can also be used to identify proteins that are 
members of the family to which the NOVX polypeptides belong. 

The NOVX genes and their corresponding encoded proteins are useful for 
preventing, treating or ameliorating medical conditions, e.g., by protein or gene therapy. 
Pathological conditions can be diagnosed by determining the amount of the new protein in 
a sample or by determining the presence of mutations in the new genes. Specific uses are 
described for each of the NOVX genes, based on the tissues in which they are most highly 
expressed. Uses include developing products for the diagnosis or treatment of a variety of 
diseases and disorders. 

The NOVX nucleic acids and proteins of the invention are useful in potential 
diagnostic and therapeutic applications and as a research tool. These include serving as a 
specific or selective nucleic acid or protein diagnostic and/or prognostic marker, wherein 
the presence or amount of the nucleic acid or the protein is to be assessed, as well as 
potential therapeutic applications such as the following: (i) a protein therapeutic, (ii) a 
small molecule drug target, (iii) an antibody target (therapeutic, diagnostic, drug 
targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy (gene delivery/gene 
ablation), (v) a composition promoting tissue regeneration in vitro and in vivo, and (vi) a 
biological defense weapon. 

In one specific embodiment, the invention includes an isolated polypeptide 
comprising an amino acid sequence selected from the group consisting of: (a) a mature 
form of the amino acid sequence selected from the group consisting of SEQ ID NO: 2n, 
wherein n is an integer between 1 and 124; (b) a variant of a mature form of the amino acid 
sequence selected from the group consisting of SEQ ID NO: 2n, wherein n is an integer 
between 1 and 124, wherein any amino acid in the mature form is changed to a different 
amino acid, provided that no more than 15% of the amino acid residues in the sequence of 
the mature form are so changed; (c) an amino acid sequence selected from the group 
consisting of SEQ ID NO: 2n, wherein n is an integer between 1 and 124; (d) a variant of 
the amino acid sequence selected from the group consisting of SEQ ID NO: 2n, wherein n is 
an integer between 1 and 124 wherein any amino acid specified in the chosen sequence is 
changed to a different amino acid, provided that no more than 15% of the amino acid 
residues in the sequence are so changed; and (e) a fragment of any of (a) through (d). 
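
The 15% limit on changed residues recited for the variants above can be checked with a short calculation. The following is a minimal sketch only, assuming that a variant differs from the reference solely by substitutions (so both sequences have the same length); the function names and the example sequences are hypothetical and are not taken from the sequence listing.

```python
def fraction_changed(reference: str, variant: str) -> float:
    """Return the fraction of positions at which variant differs from reference.

    Assumption of this sketch: only substitutions are considered, so the two
    sequences must have equal length.
    """
    if not reference:
        raise ValueError("reference sequence is empty")
    if len(reference) != len(variant):
        raise ValueError("this sketch handles substitutions only; lengths must match")
    changed = sum(1 for a, b in zip(reference, variant) if a != b)
    return changed / len(reference)


def is_permitted_variant(reference: str, variant: str, max_fraction: float = 0.15) -> bool:
    """True if no more than max_fraction of the residues are changed."""
    return fraction_changed(reference, variant) <= max_fraction


# Hypothetical 20-residue sequences used purely for illustration.
reference = "MKTAYIAKQRQISFVKSHFS"
two_changes = "MKTAYIAKQRQISFVKAHFA"   # 2 of 20 residues changed
four_changes = "AKTAYIAKQRQISFVKAHFA"  # 4 of 20 residues changed (with the Q-to-Q position unchanged)

print(fraction_changed(reference, two_changes))
print(is_permitted_variant(reference, two_changes))
```

A stricter limit, such as the 10% recited elsewhere for certain fragments, can be tested by passing a different `max_fraction`.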
In another specific embodiment, the invention includes an isolated nucleic acid 
molecule comprising a nucleic acid sequence encoding a polypeptide comprising an amino 
acid sequence selected from the group consisting of: (a) a mature form of the amino acid 
sequence given in SEQ ID NO: 2n, wherein n is an integer between 1 and 124; (b) a variant of 
a mature form of the amino acid sequence selected from the group consisting of SEQ ID 
NO: 2n, wherein n is an integer between 1 and 124 wherein any amino acid in the mature 
form of the chosen sequence is changed to a different amino acid, provided that no more 
than 15% of the amino acid residues in the sequence of the mature form are so changed; (c) 
the amino acid sequence selected from the group consisting of SEQ ID NO: 2n, wherein n 
is an integer between 1 and 124; (d) a variant of the amino acid sequence selected from the 
group consisting of SEQ ID NO: 2n, wherein n is an integer between 1 and 124, in which 
any amino acid specified in the chosen sequence is changed to a different amino acid, 
provided that no more than 15% of the amino acid residues in the sequence are so changed; 
(e) a nucleic acid fragment encoding at least a portion of a polypeptide comprising the 
amino acid sequence selected from the group consisting of SEQ ID NO: 2n, wherein n is an 
integer between 1 and 124 or any variant of said polypeptide wherein any amino acid of the 
chosen sequence is changed to a different amino acid, provided that no more than 10% of 
the amino acid residues in the sequence are so changed; and (f) the complement of any of 
said nucleic acid molecules. 

In yet another specific embodiment, the invention includes an isolated nucleic acid 
molecule, wherein said nucleic acid molecule comprises a nucleotide sequence selected 
from the group consisting of: (a) the nucleotide sequence selected from the group 
consisting of SEQ ID NO: 2n-1, wherein n is an integer between 1 and 124; (b) a 
nucleotide sequence wherein one or more nucleotides in the nucleotide sequence selected 
from the group consisting of SEQ ID NO: 2n-1, wherein n is an integer between 1 and 124 
is changed from that selected from the group consisting of the chosen sequence to a 
different nucleotide provided that no more than 15% of the nucleotides are so changed; 
(c) a nucleic acid fragment of the sequence selected from the group consisting of SEQ ID 
NO: 2n-1, wherein n is an integer between 1 and 124; and (d) a nucleic acid fragment 
wherein one or more nucleotides in the nucleotide sequence selected from the group 
consisting of SEQ ID NO: 2n-1, wherein n is an integer between 1 and 124 is changed 
from that selected from the group consisting of the chosen sequence to a different 
nucleotide provided that no more than 15% of the nucleotides are so changed. 

NOVX Nucleic Acids and Polypeptides 

One aspect of the invention pertains to isolated nucleic acid molecules that encode 
NOVX polypeptides or biologically active portions thereof. Also included in the invention 
are nucleic acid fragments sufficient for use as hybridization probes to identify 
NOVX-encoding nucleic acids (e.g., NOVX mRNAs) and fragments for use as PCR 
primers for the amplification and/or mutation of NOVX nucleic acid molecules. As used 
herein, the term "nucleic acid molecule" is intended to include DNA molecules (e.g., 
cDNA or genomic DNA), RNA molecules (e.g., mRNA), analogs of the DNA or RNA 
generated using nucleotide analogs, and derivatives, fragments and homologs thereof. The 
nucleic acid molecule may be single-stranded or double-stranded, but preferably 
comprises double-stranded DNA. 

A NOVX nucleic acid can encode a mature NOVX polypeptide. As used herein, a 
"mature" form of a polypeptide or protein disclosed in the present invention is the product 
of a naturally occurring polypeptide or precursor form or proprotein. The naturally 
occurring polypeptide, precursor or proprotein includes, by way of nonlimiting example, 
the full-length gene product encoded by the corresponding gene. Alternatively, it may be 
defined as the polypeptide, precursor or proprotein encoded by an ORF described herein. 
The product "mature" form arises, by way of nonlimiting example, as a result of one or 
more naturally occurring processing steps that may take place within the cell (e.g., host 
cell) in which the gene product arises. Examples of such processing steps leading to a 
"mature" form of a polypeptide or protein include the cleavage of the N-terminal 
methionine residue encoded by the initiation codon of an ORF, or the proteolytic cleavage 
of a signal peptide or leader sequence. Thus a mature form arising from a precursor 
polypeptide or protein that has residues 1 to N, where residue 1 is the N-terminal 
methionine, would have residues 2 through N remaining after removal of the N-terminal 
methionine. Alternatively, a mature form arising from a precursor polypeptide or protein 
having residues 1 to N, in which an N-terminal signal sequence from residue 1 to residue M 
is cleaved, would have the residues from residue M+1 to residue N remaining. Further as 
used herein, a "mature" form of a polypeptide or protein may arise from a step of 
post-translational modification other than a proteolytic cleavage event. Such additional 
processes include, by way of non-limiting example, glycosylation, myristylation or 
phosphorylation. In general, a mature polypeptide or protein may result from the operation 
of only one of these processes, or a combination of any of them. 
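
The two proteolytic routes to a mature form described above (residues 2 through N after removal of the N-terminal methionine, or residues M+1 through N after cleavage of a signal sequence) amount to simple slicing of the precursor sequence. The sketch below is illustrative only; the function names, the example precursor and the cleavage position M = 8 are hypothetical, and post-translational modifications such as glycosylation are not modeled.

```python
def remove_initiator_methionine(precursor: str) -> str:
    """Return residues 2..N of a precursor whose residue 1 is the N-terminal methionine."""
    if not precursor.startswith("M"):
        raise ValueError("precursor does not begin with methionine")
    return precursor[1:]


def cleave_signal_sequence(precursor: str, m: int) -> str:
    """Return residues M+1..N after cleavage of a signal sequence spanning residues 1..M.

    m is the 1-based position of the last residue of the signal sequence.
    """
    if not 1 <= m < len(precursor):
        raise ValueError("m must satisfy 1 <= m < N")
    return precursor[m:]


# Hypothetical 13-residue precursor; residues 1-8 are treated as the signal sequence.
precursor = "MKLLVLAASQEDT"

print(remove_initiator_methionine(precursor))  # residues 2..N
print(cleave_signal_sequence(precursor, 8))    # residues 9..N
```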

The term "probe", as utilized herein, refers to nucleic acid sequences of variable 
length, preferably between at least about 10 nucleotides (nt), about 100 nt, or as many as 
approximately, e.g., 6,000 nt, depending upon the specific use. Probes are used in the 
detection of identical, similar, or complementary nucleic acid sequences. Longer length 
probes are generally obtained from a natural or recombinant source, are highly specific, and 
are much slower to hybridize than shorter-length oligomer probes. Probes may be 
single-stranded or double-stranded and designed to have specificity in PCR, membrane-based 
hybridization technologies, or ELISA-like technologies. 

The term "isolated" nucleic acid molecule, as used herein, is a nucleic acid that is 
separated from other nucleic acid molecules which are present in the natural source of the 
nucleic acid. Preferably, an "isolated" nucleic acid is free of sequences which naturally 
flank the nucleic acid (i.e., sequences located at the 5'- and 3'-termini of the nucleic acid) in 
the genomic DNA of the organism from which the nucleic acid is derived. For example, in 
various embodiments, the isolated NOVX nucleic acid molecules can contain less than 
about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of nucleotide sequences which naturally 
flank the nucleic acid molecule in genomic DNA of the cell/tissue from which the nucleic 
acid is derived (e.g., brain, heart, liver, spleen, etc.). Moreover, an "isolated" nucleic acid 
molecule, such as a cDNA molecule, can be substantially free of other cellular material, or 
culture medium, or of chemical precursors or other chemicals. 

A nucleic acid molecule of the invention, e.g., a nucleic acid molecule having the 
nucleotide sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 124, or a 
complement of this nucleotide sequence, can be isolated using standard molecular biology 
techniques and the sequence information provided herein. Using all or a portion of the 
nucleic acid sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 124, as a 
hybridization probe, NOVX molecules can be isolated using standard hybridization and 
cloning techniques (e.g., as described in Sambrook, et al., (eds.), Molecular Cloning: A 
Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, 
NY, 1989; and Ausubel, et al., (eds.), Current Protocols in Molecular Biology, John 
Wiley & Sons, New York, NY, 1993). 

A nucleic acid of the invention can be amplified using cDNA, mRNA or 
alternatively, genomic DNA, as a template with appropriate oligonucleotide primers 
according to standard PCR amplification techniques. The nucleic acid so amplified can be 
cloned into an appropriate vector and characterized by DNA sequence analysis. 
Furthermore, oligonucleotides corresponding to NOVX nucleotide sequences can be 
prepared by standard synthetic techniques, e.g., using an automated DNA synthesizer. 

As used herein, the term "oligonucleotide" refers to a series of linked nucleotide 
residues. A short oligonucleotide sequence may be based on, or designed from, a genomic 
or cDNA sequence and is used to amplify, confirm, or reveal the presence of an identical, 
similar or complementary DNA or RNA in a particular cell or tissue. Oligonucleotides 
comprise a nucleic acid sequence of about 10 nt, 50 nt, or 100 nt in length, preferably 
about 15 nt to 30 nt in length. In one embodiment of the invention, an oligonucleotide 
comprising a nucleic acid molecule less than 100 nt in length would further comprise at 
least 6 contiguous nucleotides of SEQ ID NO:2n-1, wherein n is an integer between 1 and 
124, or a complement thereof. Oligonucleotides may be chemically synthesized and may 
also be used as probes. 

In another embodiment, an isolated nucleic acid molecule of the invention 
comprises a nucleic acid molecule that is a complement of the nucleotide sequence shown 
in SEQ ID NO:2n-1, wherein n is an integer between 1 and 124, or a portion of this 
nucleotide sequence (e.g., a fragment that can be used as a probe or primer or a fragment 
encoding a biologically-active portion of a NOVX polypeptide). A nucleic acid molecule 
that is complementary to the nucleotide sequence of SEQ ID NO:2n-1, wherein n is an 
integer between 1 and 124, is one that is sufficiently complementary to the nucleotide 
sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 124, that it can 
hydrogen bond with few or no mismatches to the nucleotide sequence shown in SEQ ID 
NO:2n-1, wherein n is an integer between 1 and 124, thereby forming a stable duplex. 
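
By way of illustration, the notion of a complement that pairs "with few or no mismatches" can be expressed as a comparison against the exact reverse complement. This is a sketch under stated assumptions: only Watson-Crick pairing of unmodified DNA bases is considered, the candidate strand has the same length as the target, and the function names and sequences are hypothetical.

```python
# Watson-Crick pairing partners for DNA bases.
_PAIRS = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the strand that pairs perfectly with seq, written 5' to 3'."""
    return seq.upper().translate(_PAIRS)[::-1]


def mismatches_to_complement(target: str, candidate: str) -> int:
    """Count positions at which candidate fails to pair with target.

    Both strands are written 5' to 3' and must have the same length.
    """
    if len(target) != len(candidate):
        raise ValueError("this sketch compares equal-length strands only")
    perfect = reverse_complement(target)
    return sum(1 for a, b in zip(perfect, candidate.upper()) if a != b)


target = "ATGGCCAAGT"  # hypothetical target sequence
print(reverse_complement(target))
print(mismatches_to_complement(target, "ACTTGGCCAA"))
```

How many mismatches still permit a stable duplex depends on length, base composition and hybridization conditions, so no threshold is built into the sketch.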

As used herein, the term "complementary" refers to Watson-Crick or Hoogsteen 
base pairing between nucleotide units of a nucleic acid molecule, and the term "binding" 
means the physical or chemical interaction between two polypeptides or compounds or 
associated polypeptides or compounds or combinations thereof. Binding includes ionic, 
non-ionic, van der Waals, hydrophobic interactions, and the like. A physical interaction 
can be either direct or indirect. Indirect interactions may be through or due to the effects of 
another polypeptide or compound. Direct binding refers to interactions that do not take 
place through, or due to, the effect of another polypeptide or compound, but instead are 
without other substantial chemical intermediates. 

A "fragment" provided herein is defined as a sequence of at least 6 (contiguous) 
nucleic acids or at least 4 (contiguous) amino acids, a length sufficient to allow for specific 
hybridization in the case of nucleic acids or for specific recognition of an epitope in the 
case of amino acids, and is at most some portion less than a full length sequence. 
Fragments may be derived from any contiguous portion of a nucleic acid or amino acid 
sequence of choice. 

A full-length NOVX clone is identified as containing an ATG translation start 
codon and an in-frame stop codon. Any disclosed NOVX nucleotide sequence lacking an 
ATG start codon therefore encodes a truncated C-terminal fragment of the respective 
NOVX polypeptide, and requires that the corresponding full-length cDNA extend in the 5' 
direction of the disclosed sequence. Any disclosed NOVX nucleotide sequence lacking an 
in-frame stop codon similarly encodes a truncated N-terminal fragment of the respective 
NOVX polypeptide, and requires that the corresponding full-length cDNA extend in the 3' 
direction of the disclosed sequence. 
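
The test for a full-length clone described above (an ATG start codon followed by an in-frame stop codon) can be sketched as follows. This is a simplified illustration: it assumes the first ATG found is the translation start, and the function name, return strings and example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def classify_clone(seq: str) -> str:
    """Classify a nucleotide sequence by the presence of an ATG and an in-frame stop codon.

    Simplifying assumption: the first ATG in the sequence is taken as the start codon.
    """
    seq = seq.upper()
    start = seq.find("ATG")
    if start == -1:
        return "no ATG: full-length cDNA extends 5' of this sequence"
    # Walk codon by codon in the frame set by the ATG.
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return "full-length"
    return "no in-frame stop: full-length cDNA extends 3' of this sequence"


print(classify_clone("GGATGAAATTTTAGCC"))  # ATG ... TAG in frame
print(classify_clone("ATGAAATTT"))         # start codon but no in-frame stop
print(classify_clone("CCCAAATTT"))         # no start codon
```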

A "derivative" is a nucleic acid sequence or amino acid sequence formed from the 
native compounds either directly, by modification or partial substitution. An "analog" is a 
nucleic acid sequence or amino acid sequence that has a structure similar to, but not 
identical to, the native compound, e.g., it differs from the native compound with respect to certain components 
or side chains. Analogs may be synthetic or derived from a different evolutionary origin 
and may have a similar or opposite metabolic activity compared to wild type. A 
"homolog" is a nucleic acid sequence or amino acid sequence of a particular gene that is 
derived from different species. 

Derivatives and analogs may be full length or other than full length. Derivatives or 
analogs of the nucleic acids or proteins of the invention include, but are not limited to, 
molecules comprising regions that are substantially homologous to the nucleic acids or 
proteins of the invention, in various embodiments, by at least about 70%, 80%, or 95% 
identity (with a preferred identity of 80-95%) over a nucleic acid or amino acid sequence of 
identical size or when compared to an aligned sequence in which the alignment is done by a 
computer homology program known in the art, or whose encoding nucleic acid is capable 
of hybridizing to the complement of a sequence encoding the proteins under stringent, 
moderately stringent, or low stringent conditions. See, e.g., Ausubel, et al., Current 
Protocols in Molecular Biology, John Wiley & Sons, New York, NY, 1993, and 
below. 

A "homologous nucleic acid sequence" or "homologous amino acid sequence," or 
variations thereof, refer to sequences characterized by a homology at the nucleotide level or 
amino acid level as discussed above. Homologous nucleotide sequences include those 
sequences coding for isoforms of NOVX polypeptides. Isoforms can be expressed in 
different tissues of the same organism as a result of, for example, alternative splicing of 
RNA. Alternatively, isoforms can be encoded by different genes. In the invention, 
homologous nucleotide sequences include nucleotide sequences encoding a NOVX 
polypeptide of species other than humans, including, but not limited to, vertebrates, and 
thus can include, e.g., frog, mouse, rat, rabbit, dog, cat, cow, horse, and other organisms. 
Homologous nucleotide sequences also include, but are not limited to, naturally occurring 
allelic variations and mutations of the nucleotide sequences set forth herein. A homologous 
nucleotide sequence does not, however, include the exact nucleotide sequence encoding 
human NOVX protein. Homologous nucleic acid sequences include those nucleic acid 
sequences that encode conservative amino acid substitutions (see below) in SEQ ID 
NO:2n-1, wherein n is an integer between 1 and 124, as well as a polypeptide possessing 
NOVX biological activity. Various biological activities of the NOVX proteins are 
described below. 

A NOVX polypeptide is encoded by the open reading frame ("ORF") of a NOVX 
nucleic acid. An ORF corresponds to a nucleotide sequence that could potentially be 
translated into a polypeptide. A stretch of nucleic acids comprising an ORF is 
uninterrupted by a stop codon. An ORF that represents the coding sequence for a full 
protein begins with an ATG "start" codon and terminates with one of the three "stop" 
codons, namely, TAA, TAG, or TGA. For the purposes of this invention, an ORF may be 
any part of a coding sequence, with or without a start codon, a stop codon, or both. For an 
ORF to be considered as a good candidate for coding for a bona fide cellular protein, a 
minimum size requirement is often set, e.g., a stretch of DNA that would encode a protein 
of 50 amino acids or more. 
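
As an illustration of the ORF definition and the minimum size requirement discussed above, the following sketch scans the three forward reading frames of a DNA sequence for stretches that begin with ATG, end with TAA, TAG or TGA, and encode at least 50 amino acids. It is a simplified example: the reverse strand is not searched, partial ORFs lacking a start or stop codon are ignored, and the function name and test sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 50) -> list[tuple[int, int]]:
    """Return (start, end) index pairs of complete ORFs on the forward strand.

    start is the 0-based index of the A of ATG; end is one past the stop codon.
    min_codons counts the encoded amino acids (the stop codon is excluded).
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first ATG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3    # amino acids encoded before the stop
                if n_codons >= min_codons:
                    orfs.append((start, i + 3))
                start = None                   # look for the next ATG in this frame
    return orfs


# Hypothetical sequence: ATG followed by 49 alanine codons and a stop (50 amino acids).
long_orf = "CC" + "ATG" + "GCT" * 49 + "TAA" + "GG"
short_orf = "ATG" + "GCT" * 10 + "TAA"

print(find_orfs(long_orf))
print(find_orfs(short_orf))
```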

The nucleotide sequences determined from the cloning of the human NOVX genes 
allow for the generation of probes and primers designed for use in identifying and/or 
cloning NOVX homologues in other cell types, e.g., from other tissues, as well as NOVX 
homologues from other vertebrates. The probe/primer typically comprises a substantially 
purified oligonucleotide. The oligonucleotide typically comprises a region of nucleotide 
sequence that hybridizes under stringent conditions to at least about 12, 25, 50, 100, 150, 
200, 250, 300, 350 or 400 consecutive nucleotides of a sense strand nucleotide sequence of SEQ ID 
NO:2n-1, wherein n is an integer between 1 and 124; or an anti-sense strand nucleotide 
sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 124; or of a naturally 
occurring mutant of SEQ ID NO:2n-1, wherein n is an integer between 1 and 124. 

Probes based on the human NOVX nucleotide sequences can be used to detect 
transcripts or genomic sequences encoding the same or homologous proteins. In various 
embodiments, the probe has a detectable label attached, e.g., the label can be a radioisotope, 
a fluorescent compound, an enzyme, or an enzyme co-factor. Such probes can be used as a 
part of a diagnostic test kit for identifying cells or tissues which mis-express a NOVX 
protein, such as by measuring a level of a NOVX-encoding nucleic acid in a sample of cells 
from a subject, e.g., detecting NOVX mRNA levels or determining whether a genomic 
NOVX gene has been mutated or deleted. 

"A polypeptide having a biologically-active portion of a NOVX polypeptide" refers 
to polypeptides exhibiting activity similar, but not necessarily identical to, an activity of a 
polypeptide of the invention, including mature forms, as measured in a particular biological 
assay, with or without dose dependency. A nucleic acid fragment encoding a 
"biologically-active portion of NOVX" can be prepared by isolating a portion of SEQ ID 
NO:2n-1, wherein n is an integer between 1 and 124, that encodes a polypeptide having a 
NOVX biological activity (the biological activities of the NOVX proteins are described 
below), expressing the encoded portion of NOVX protein (e.g., by recombinant expression 
in vitro) and assessing the activity of the encoded portion of NOVX. 

NOVX Nucleic Acid and Polypeptide Variants 

The invention further encompasses nucleic acid molecules that differ from the 
nucleotide sequences of SEQ ID NO:2n-1, wherein n is an integer between 1 and 124, due 
to degeneracy of the genetic code and thus encode the same NOVX proteins as that 
encoded by the nucleotide sequences of SEQ ID NO:2n-1, wherein n is an integer between 
1 and 124. In another embodiment, an isolated nucleic acid molecule of the invention has a 
nucleotide sequence encoding a protein having an amino acid sequence of SEQ ID NO:2n, 
wherein n is an integer between 1 and 124. 
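
Degeneracy of the genetic code, as referred to above, means that nucleotide sequences differing at synonymous codon positions encode the same protein. The sketch below illustrates this with the standard genetic code; the helper name `translate` and the two short example sequences are hypothetical and are not NOVX sequences.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence from its first codon up to the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon ends translation
            break
        protein.append(aa)
    return "".join(protein)


seq_a = "ATGGCTAAATAA"  # Met-Ala-Lys-stop
seq_b = "ATGGCCAAGTGA"  # same protein, different codons for Ala, Lys and stop

print(translate(seq_a), translate(seq_b))
print(seq_a != seq_b and translate(seq_a) == translate(seq_b))
```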

In addition to the human NOVX nucleotide sequences of SEQ ID NO:2n-1, wherein 
n is an integer between 1 and 124, it will be appreciated by those skilled in the art that 
DNA sequence polymorphisms that lead to changes in the amino acid sequences of the 
NOVX polypeptides may exist within a population (e.g., the human population). Such 
genetic polymorphism in the NOVX genes may exist among individuals within a 
population due to natural allelic variation. As used herein, the terms "gene" and 
"recombinant gene" refer to nucleic acid molecules comprising an open reading frame 
(ORF) encoding a NOVX protein, preferably a vertebrate NOVX protein. Such natural 
allelic variations can typically result in 1-5% variance in the nucleotide sequence of the 
NOVX genes. Any and all such nucleotide variations and resulting amino acid 
polymorphisms in the NOVX polypeptides, which are the result of natural allelic variation 
and that do not alter the functional activity of the NOVX polypeptides, are intended to be 
within the scope of the invention. 

Moreover, nucleic acid molecules encoding NOVX proteins from other species, and 
thus that have a nucleotide sequence that differs from a human SEQ ID NO:2n-1, wherein n 
is an integer between 1 and 124, are intended to be within the scope of the invention. 
Nucleic acid molecules corresponding to natural allelic variants and homologues of the 
NOVX cDNAs of the invention can be isolated based on their homology to the human 
NOVX nucleic acids disclosed herein using the human cDNAs, or a portion thereof, as a 
hybridization probe according to standard hybridization techniques under stringent 
hybridization conditions. 

Accordingly, in another embodiment, an isolated nucleic acid molecule of the 
invention is at least 6 nucleotides in length and hybridizes under stringent conditions to the 
nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:2n-1, wherein n 
is an integer between 1 and 124. In another embodiment, the nucleic acid is at least 10, 25, 
50, 100, 250, 500, 750, 1000, 1500, or 2000 or more nucleotides in length. In yet another 
embodiment, an isolated nucleic acid molecule of the invention hybridizes to the coding 
region. As used herein, the term "hybridizes under stringent conditions" is intended to 
describe conditions for hybridization and washing under which nucleotide sequences at 
least about 65% homologous to each other typically remain hybridized to each other. 

Homologs (i.e., nucleic acids encoding NOVX proteins derived from species other 
than human) or other related sequences (e.g., paralogs) can be obtained by low, moderate or 
high stringency hybridization with all or a portion of the particular human sequence as a 
probe using methods well known in the art for nucleic acid hybridization and cloning. 

As used herein, the phrase "stringent hybridization conditions" refers to conditions 
under which a probe, primer or oligonucleotide will hybridize to its target sequence, but to 
no other sequences. Stringent conditions are sequence-dependent and will be different in 
different circumstances. Longer sequences hybridize specifically at higher temperatures 
than shorter sequences. Generally, stringent conditions are selected to be about 5 °C lower 
than the thermal melting point (Tm) for the specific sequence at a defined ionic strength 
and pH. The Tm is the temperature (under defined ionic strength, pH and nucleic acid 
concentration) at which 50% of the probes complementary to the target sequence hybridize 
to the target sequence at equilibrium. Since the target sequences are generally present in 
excess, at Tm, 50% of the probes are occupied at equilibrium. Typically, stringent 
conditions will be those in which the salt concentration is less than about 1.0 M sodium ion, 
typically about 0.01 to 1.0 M sodium ion (or other salts) at pH 7.0 to 8.3 and the 
temperature is at least about 30 °C for short probes, primers or oligonucleotides (e.g., 10 nt 
to 50 nt) and at least about 60 °C for longer probes, primers and oligonucleotides. 
Stringent conditions may also be achieved with the addition of destabilizing agents, such as 
formamide. 
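
The relationship between Tm and a stringent hybridization temperature (about 5 °C below Tm) can be illustrated numerically. The text does not prescribe a way of estimating Tm; the sketch below assumes the widely used rule-of-thumb approximation for short oligonucleotides, Tm = 2 °C x (A+T) + 4 °C x (G+C), which ignores ionic strength, pH, probe concentration and destabilizing agents such as formamide. The function names and the example probe are hypothetical.

```python
def estimate_tm_short_oligo(seq: str) -> float:
    """Rough Tm in degrees C for a short oligonucleotide.

    Assumption: rule-of-thumb approximation 2*(A+T) + 4*(G+C); it is not taken
    from the text and is only meaningful for short probes.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq) or not seq:
        raise ValueError("sequence must be non-empty and contain only A, C, G, T")
    return 2.0 * at + 4.0 * gc


def stringent_temperature(seq: str, offset: float = 5.0) -> float:
    """Temperature about `offset` degrees C below the estimated Tm."""
    return estimate_tm_short_oligo(seq) - offset


probe = "ACGTACGTACGTACGT"  # hypothetical 16-nt probe with 8 A/T and 8 G/C
print(estimate_tm_short_oligo(probe))
print(stringent_temperature(probe))
```

For longer probes, empirically determined or more detailed thermodynamic estimates of Tm would normally be used instead of this approximation.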

Stringent conditions are known to those skilled in the art and can be found in 
Ausubel, et al., (eds.), Current Protocols in Molecular Biology, John Wiley & Sons, 
N.Y. (1989), 6.3.1-6.3.6. Preferably, the conditions are such that sequences at least about 
65%, 70%, 75%, 85%, 90%, 95%, 98%, or 99% homologous to each other typically remain 
hybridized to each other. A non-limiting example of stringent hybridization conditions is 
hybridization in a high salt buffer comprising 6X SSC, 50 mM Tris-HCl (pH 7.5), 1 mM 
EDTA, 0.02% PVP, 0.02% Ficoll, 0.02% BSA, and 500 mg/ml denatured salmon sperm 
DNA at 65°C, followed by one or more washes in 0.2X SSC, 0.01% BSA at 50°C. An 
isolated nucleic acid molecule of the invention that hybridizes under stringent conditions to 
a sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 124, corresponds to 
a naturally-occurring nucleic acid molecule. As used herein, a "naturally-occurring" 
nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence 
that occurs in nature (e.g., encodes a natural protein). 

In a second embodiment, a nucleic acid sequence that is hybridizable to the nucleic 
acid molecule comprising the nucleotide sequence of SEQ ID NO:2n-1, wherein n is an 
integer between 1 and 124, or fragments, analogs or derivatives thereof, under conditions of 
moderate stringency is provided. A non-limiting example of moderate stringency 
hybridization conditions is hybridization in 6X SSC, 5X Denhardt's solution, 0.5% SDS 
and 100 mg/ml denatured salmon sperm DNA at 55 °C, followed by one or more washes in 
1X SSC, 0.1% SDS at 37 °C. Other conditions of moderate stringency that may be used 
are well-known within the art. See, e.g., Ausubel, et al. (eds.), 1993, Current Protocols 
in Molecular Biology, John Wiley & Sons, NY, and Kriegler, 1990, Gene Transfer 
and Expression, A Laboratory Manual, Stockton Press, NY. 

In a third embodiment, a nucleic acid that is hybridizable to the nucleic acid 
molecule comprising the nucleotide sequences of SEQ ID NO:2n-1, wherein n is an integer 
between 1 and 124, or fragments, analogs or derivatives thereof, under conditions of low 
stringency, is provided. A non-limiting example of low stringency hybridization conditions 
is hybridization in 35% formamide, 5X SSC, 50 mM Tris-HCl (pH 7.5), 5 mM EDTA, 
0.02% PVP, 0.02% Ficoll, 0.2% BSA, 100 mg/ml denatured salmon sperm DNA, 10% 
(wt/vol) dextran sulfate at 40°C, followed by one or more washes in 2X SSC, 25 mM 
Tris-HCl (pH 7.4), 5 mM EDTA, and 0.1% SDS at 50°C. Other conditions of low 
stringency that may be used are well known in the art (e.g., as employed for cross-species 
hybridizations). See, e.g., Ausubel, et al. (eds.), 1993, Current Protocols in 
Molecular Biology, John Wiley & Sons, NY, and Kriegler, 1990, Gene Transfer and 
Expression, A Laboratory Manual, Stockton Press, NY; Shilo and Weinberg, 1981, 
Proc Natl Acad Sci USA 78: 6789-6792. 

Conservative Mutations 

In addition to naturally-occurring allelic variants of NOVX sequences that may 
exist in the population, the skilled artisan will further appreciate that changes can be 
introduced by mutation into the nucleotide sequences of SEQ ID NO:2n-1, wherein n is an 
integer between 1 and 124, thereby leading to changes in the amino acid sequences of the 
encoded NOVX protein, without altering the functional ability of that NOVX protein. For 
example, nucleotide substitutions leading to amino acid substitutions at "non-essential" 
amino acid residues can be made in the sequence of SEQ ID NO:2n, wherein n is an integer 
between 1 and 124. A "non-essential" amino acid residue is a residue that can be altered 
from the wild-type sequences of the NOVX proteins without altering their biological 
activity, whereas an "essential" amino acid residue is required for such biological activity. 
For example, amino acid residues that are conserved among the NOVX proteins of the 
invention are predicted to be particularly non-amenable to alteration. Amino acids for 
which conservative substitutions can be made are well-known within the art. 

Another aspect of the invention pertains to nucleic acid molecules encoding NOVX
proteins that contain changes in amino acid residues that are not essential for activity. Such
NOVX proteins differ in amino acid sequence from SEQ ID NO:2n-1, wherein n is an
integer between 1 and 124, yet retain biological activity. In one embodiment, the isolated
nucleic acid molecule comprises a nucleotide sequence encoding a protein, wherein the
protein comprises an amino acid sequence at least about 40% homologous to the amino
acid sequences of SEQ ID NO:2n, wherein n is an integer between 1 and 124. Preferably,
the protein encoded by the nucleic acid molecule is at least about 60% homologous to SEQ
ID NO:2n, wherein n is an integer between 1 and 124; more preferably at least about 70%
homologous to SEQ ID NO:2n, wherein n is an integer between 1 and 124; still more
preferably at least about 80% homologous to SEQ ID NO:2n, wherein n is an integer
between 1 and 124; even more preferably at least about 90% homologous to SEQ ID
NO:2n, wherein n is an integer between 1 and 124; and most preferably at least about 95%
homologous to SEQ ID NO:2n, wherein n is an integer between 1 and 124.
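
By way of non-limiting illustration, the percentage thresholds recited above can be checked for two sequences that have already been aligned. The sketch below is an assumption-laden example only: the function name percent_identity is hypothetical, the sequences are taken to be pre-aligned with "-" as the gap character, and simple identity over ungapped columns is used as a stand-in for whatever homology measure is defined elsewhere in the specification.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of ungapped alignment columns whose residues are identical.

    Assumes both inputs are already aligned and use '-' for gaps
    (an illustrative convention, not one taken from the specification).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which neither sequence has a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, minimum: float) -> bool:
    """True if the pair is at least `minimum` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= minimum
```

A variant could then be screened against, e.g., the 80% or 95% levels by calling meets_threshold with the corresponding minimum.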

An isolated nucleic acid molecule encoding a NOVX protein homologous to the 
protein of SEQ ID NO:2n, wherein n is an integer between 1 and 124, can be created by 
introducing one or more nucleotide substitutions, additions or deletions into the nucleotide 
sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 124, such that one or
more amino acid substitutions, additions or deletions are introduced into the encoded
protein. 

Mutations can be introduced into any one of SEQ ID NO:2n-1, wherein n is an integer
between 1 and 124, by standard techniques, such as site-directed mutagenesis and
PCR-mediated mutagenesis. Preferably, conservative amino acid substitutions are made at
one or more predicted, non-essential amino acid residues. A "conservative amino acid
substitution" is one in which the amino acid residue is replaced with an amino acid residue
having a similar side chain. Families of amino acid residues having similar side chains
have been defined within the art. These families include amino acids with basic side chains
(e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid),
uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine,
tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline,
phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine,
isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine).
Thus, a predicted non-essential amino acid residue in the NOVX protein is replaced with
another amino acid residue from the same side chain family. Alternatively, in another
embodiment, mutations can be introduced randomly along all or part of a NOVX coding
sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for
NOVX biological activity to identify mutants that retain activity. Following mutagenesis
of a nucleic acid of SEQ ID NO:2n-1, wherein n is an integer between 1 and 124, the
encoded protein can be expressed by any recombinant technology known in the art and the
activity of the protein can be determined.
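
By way of non-limiting illustration, the side chain families recited above can be encoded as sets of single letter amino acid codes, and a proposed substitution can be tested for membership in a shared family. The names SIDE_CHAIN_FAMILIES and is_conservative are hypothetical, and treating "shares at least one family" as the test is an assumption made for this sketch.

```python
# Side chain families as listed in the text, in single letter code.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),                # lysine, arginine, histidine
    "acidic": set("DE"),                # aspartic acid, glutamic acid
    "uncharged_polar": set("GNQSTYC"),  # Gly, Asn, Gln, Ser, Thr, Tyr, Cys
    "nonpolar": set("AVLIPFMW"),        # Ala, Val, Leu, Ile, Pro, Phe, Met, Trp
    "beta_branched": set("TVI"),        # threonine, valine, isoleucine
    "aromatic": set("YFWH"),            # Tyr, Phe, Trp, His
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues belong to at least one common family."""
    return any(original in family and replacement in family
               for family in SIDE_CHAIN_FAMILIES.values())
```

Because a residue may belong to more than one family (e.g., histidine is listed as both basic and aromatic), the check passes if any single family contains both residues.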

The relatedness of amino acid families may also be determined based on side chain
interactions. Substituted amino acids may be fully conserved "strong" residues or fully
conserved "weak" residues. The "strong" group of conserved amino acid residues may be
any one of the following groups: STA, NEQK, NHQK, NDEQ, QHRK, MILV, MILF, HY,
FYW, wherein the single letter amino acid codes are grouped by those amino acids that
may be substituted for each other. Likewise, the "weak" group of conserved residues may
be any one of the following: CSA, ATV, SAG, STNK, STPA, SGND, SNDEQK,
NDEQHK, NEQHRK, HFY, wherein the letters within each group represent the single
letter amino acid code.
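
By way of non-limiting illustration, the "strong" and "weak" groups can be used to classify a substitution. In the sketch below the group lists are transcribed from the paragraph above, with the seventh strong group taken to be MILF (an assumption where the printed text is unclear); the function name conservation_class and the returned labels are hypothetical.

```python
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK",
                 "MILV", "MILF", "HY", "FYW"]
WEAK_GROUPS = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
               "SNDEQK", "NDEQHK", "NEQHRK", "HFY"]


def conservation_class(a: str, b: str) -> str:
    """Classify a residue pair as identical, strong, weak or none."""
    if a == b:
        return "identical"
    # Strong groups are checked first so they take precedence over weak ones.
    if any(a in group and b in group for group in STRONG_GROUPS):
        return "strong"
    if any(a in group and b in group for group in WEAK_GROUPS):
        return "weak"
    return "none"
```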

In one embodiment, a mutant NOVX protein can be assayed for (i) the ability to
form protein:protein interactions with other NOVX proteins, other cell-surface proteins, or
biologically-active portions thereof; (ii) complex formation between a mutant NOVX
protein and a NOVX ligand; or (iii) the ability of a mutant NOVX protein to bind to an
intracellular target protein or biologically-active portion thereof (e.g., avidin proteins).

In yet another embodiment, a mutant NOVX protein can be assayed for the ability
to regulate a specific biological function (e.g., regulation of insulin release).

Interfering RNA 

In one aspect of the invention, NOVX gene expression can be attenuated by RNA
interference. One approach well-known in the art is short interfering RNA (siRNA)
mediated gene silencing where expression products of a NOVX gene are targeted by
specific double stranded NOVX derived siRNA nucleotide sequences that are
complementary to at least a 19-25 nt long segment of the NOVX gene transcript, including
the 5' untranslated (UT) region, the ORF, or the 3' UT region. See, e.g., PCT applications
WO00/44895, WO99/32619, WO01/75164, WO01/92513, WO01/29058, WO01/89304,
WO02/16620, and WO02/29858, each incorporated by reference herein in its entirety.
Targeted genes can be a NOVX gene, or an upstream or downstream modulator of the
NOVX gene. Nonlimiting examples of upstream or downstream modulators of a NOVX
gene include, e.g., a transcription factor that binds the NOVX gene promoter, a kinase or
phosphatase that interacts with a NOVX polypeptide, and polypeptides involved in a
NOVX regulatory pathway.

According to the methods of the present invention, NOVX gene expression is 
silenced using short interfering RNA. A NOVX polynucleotide according to the invention 
includes a siRNA polynucleotide. Such a NOVX siRNA can be obtained using a NOVX
polynucleotide sequence, for example, by processing the NOVX ribopolynucleotide
sequence in a cell-free system, such as but not limited to a Drosophila extract, or by 
transcription of recombinant double stranded NOVX RNA or by chemical synthesis of 
nucleotide sequences homologous to a NOVX sequence. See, e.g., Tuschl, Zamore, 
Lehmann, Bartel and Sharp (1999), Genes & Dev. 13: 3191-3197, incorporated herein by
reference in its entirety. When synthesized, a typical 0.2 micromole-scale RNA synthesis
provides about 1 milligram of siRNA, which is sufficient for 1000 transfection experiments 
using a 24-well tissue culture plate format. 
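
By way of non-limiting illustration, the yield statement above can be checked with simple arithmetic. The sketch assumes a per-well amount of about 0.84 micrograms of siRNA duplex, the figure given later in this section for one well of a 24-well plate; the function name wells_supported is hypothetical.

```python
def wells_supported(total_ug: float, per_well_ug: float) -> int:
    """Number of whole wells that a given siRNA mass can transfect."""
    if per_well_ug <= 0:
        raise ValueError("per-well amount must be positive")
    return int(total_ug // per_well_ug)


# About 1 mg (1000 micrograms) of siRNA at about 0.84 micrograms per well.
wells = wells_supported(1000.0, 0.84)
```

Under these assumptions the result is somewhat above 1000 wells, consistent with the approximate figure stated in the text.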

The most efficient silencing is generally observed with siRNA duplexes composed
of a 21-nt sense strand and a 21-nt antisense strand, paired in a manner to have a 2-nt
3' overhang. The sequence of the 2-nt 3' overhang makes an additional small contribution
to the specificity of siRNA target recognition. The contribution to specificity is localized to
the unpaired nucleotide adjacent to the first paired bases. In one embodiment, the
nucleotides in the 3' overhang are ribonucleotides. In an alternative embodiment, the
nucleotides in the 3' overhang are deoxyribonucleotides. Using 2'-deoxyribonucleotides in
the 3' overhangs is as efficient as using ribonucleotides, but deoxyribonucleotides are often
cheaper to synthesize and are most likely more nuclease resistant.
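
By way of non-limiting illustration, a duplex of the geometry described above (two 21-nt strands pairing over 19 nt, each with a 2-nt 3' overhang) can be assembled from a 19-nt sense core. This is a sketch only: the function names are hypothetical, the example core is arbitrary, and writing a deoxythymidine overhang as "TT" appended to an RNA string is a notational assumption.

```python
from typing import Tuple

RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq: str) -> str:
    """Reverse complement of an RNA sequence written 5' to 3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq))


def build_duplex(core_sense: str, overhang: str = "UU") -> Tuple[str, str]:
    """Return (sense, antisense), each a 19-nt core plus a 2-nt 3' overhang.

    Use overhang="UU" for ribonucleotide overhangs or "TT" to denote
    deoxyribonucleotide overhangs.
    """
    if len(core_sense) != 19 or len(overhang) != 2:
        raise ValueError("expected a 19-nt core and a 2-nt overhang")
    sense = core_sense + overhang
    # The antisense core pairs with the sense core; its overhang is unpaired.
    antisense = reverse_complement_rna(core_sense) + overhang
    return sense, antisense


sense, antisense = build_duplex("GCUAGCUAGCUAGCUAGCU", overhang="TT")
```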

A contemplated recombinant expression vector of the invention comprises a NOVX 
DNA molecule cloned into an expression vector comprising operatively-linked regulatory 
sequences flanking the NOVX sequence in a manner that allows for expression (by 
transcription of the DNA molecule) of both strands. An RNA molecule that is antisense to 
NOVX mRNA is transcribed by a first promoter (e.g., a promoter sequence 3' of the cloned 
DNA) and an RNA molecule that is the sense strand for the NOVX mRNA is transcribed 
by a second promoter (e.g., a promoter sequence 5' of the cloned DNA). The sense and 
antisense strands may hybridize in vivo to generate siRNA constructs for silencing of the 
NOVX gene. Alternatively, two constructs can be utilized to create the sense and 
anti-sense strands of a siRNA construct. Finally, cloned DNA can encode a construct 
having secondary structure, wherein a single transcript has both the sense and 
complementary antisense sequences from the target gene or genes. In an example of this 
embodiment, a hairpin RNAi product is homologous to all or a portion of the target gene. 
In another example, a hairpin RNAi product is a siRNA. The regulatory sequences 
flanking the NOVX sequence may be identical or may be different, such that their 
expression may be modulated independently, or in a temporal or spatial manner. 

In a specific embodiment, siRNAs are transcribed intracellularly by cloning the
NOVX gene templates into a vector containing, e.g., a RNA pol III transcription unit from
the small nuclear RNA (snRNA) U6 or the human RNase P RNA H1. One example of a
vector system is the GeneSuppressor™ RNA Interference kit (commercially available from
Imgenex). The U6 and H1 promoters are members of the type III class of Pol III promoters.
The +1 nucleotide of the U6-like promoters is always guanosine, whereas the +1 for H1
promoters is adenosine. The termination signal for these promoters is defined by five
consecutive thymidines. The transcript is typically cleaved after the second uridine.
Cleavage at this position generates a 3' UU overhang in the expressed siRNA, which is
similar to the 3' overhangs of synthetic siRNAs. Any sequence less than 400 nucleotides in
length can be transcribed by these promoters; therefore they are ideally suited for the
expression of around 21-nucleotide siRNAs in, e.g., an approximately 50-nucleotide RNA
stem-loop transcript.

A siRNA vector appears to have an advantage over synthetic siRNAs where
long-term knock-down of expression is desired. Cells transfected with a siRNA expression
vector would experience steady, long-term mRNA inhibition. In contrast, cells transfected 
with exogenous synthetic siRNAs typically recover from mRNA suppression within seven
days or ten rounds of cell division. The long-term gene silencing ability of siRNA
expression vectors may provide for applications in gene therapy. 

In general, siRNAs are chopped from longer dsRNA by an ATP-dependent 
ribonuclease called DICER. DICER is a member of the RNase III family of 
double-stranded RNA-specific endonucleases. The siRNAs assemble with cellular proteins
into an endonuclease complex. In vitro studies in Drosophila suggest that the
siRNAs/protein complex (siRNP) is then transferred to a second enzyme complex, called
an RNA-induced silencing complex (RISC), which contains an endoribonuclease that is 
distinct from DICER. RISC uses the sequence encoded by the antisense siRNA strand to 
find and destroy mRNAs of complementary sequence. The siRNA thus acts as a guide,
restricting the ribonuclease to cleave only mRNAs complementary to one of the two siRNA
strands. 

A NOVX mRNA region to be targeted by siRNA is generally selected from a
desired NOVX sequence beginning 50 to 100 nt downstream of the start codon.
Alternatively, 5' or 3' UTRs and regions nearby the start codon can be used but are
generally avoided, as these may be richer in regulatory protein binding sites. UTR-binding
proteins and/or translation initiation complexes may interfere with binding of the siRNP or
RISC endonuclease complex. An initial BLAST homology search for the selected siRNA
sequence is done against an available nucleotide sequence library to ensure that only one
gene is targeted. Specificity of target recognition by siRNA duplexes indicates that a single
point mutation located in the paired region of an siRNA duplex is sufficient to abolish
target mRNA degradation. See, Elbashir et al. 2001 EMBO J. 20(23):6877-88. Hence,
consideration should be taken to accommodate SNPs, polymorphisms, allelic variants or
species-specific variations when targeting a desired gene.

In one embodiment, a complete NOVX siRNA experiment includes the proper
negative control. A negative control siRNA generally has the same nucleotide composition
as the NOVX siRNA but lacks significant sequence homology to the genome. Typically,
one would scramble the nucleotide sequence of the NOVX siRNA and do a homology
search to make sure it lacks homology to any other gene.
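
By way of non-limiting illustration, a scrambled control of identical nucleotide composition can be produced by shuffling the bases of the siRNA sequence. The sketch below covers only the scrambling step; the subsequent homology search is not performed by this code and would be run separately. The function name scramble, the seed parameter and the example sequence are hypothetical.

```python
import random
from collections import Counter
from typing import Optional


def scramble(seq: str, seed: Optional[int] = None, max_tries: int = 1000) -> str:
    """Return a shuffled copy of `seq` with the same base composition.

    Retries until the shuffled sequence differs from the input, so the
    control is never identical to the targeting siRNA.
    """
    rng = random.Random(seed)
    bases = list(seq)
    for _ in range(max_tries):
        rng.shuffle(bases)
        candidate = "".join(bases)
        if candidate != seq:
            return candidate
    raise ValueError("could not produce a scramble distinct from the input")


original = "AAGCUGACCUUGAAGCUCAUU"
control = scramble(original, seed=1)
# Composition is preserved even though the order has changed.
same_composition = Counter(control) == Counter(original)
```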

WO 03/029424 



PCT/US02/31373 



Two independent NOVX siRNA duplexes can be used to knock-down a target
NOVX gene. This helps to control for specificity of the silencing effect. In addition,
expression of two independent genes can be simultaneously knocked down by using equal
concentrations of different NOVX siRNA duplexes, e.g., a NOVX siRNA and an siRNA
for a regulator of a NOVX gene or polypeptide. Availability of siRNA-associating proteins
is believed to be more limiting than target mRNA accessibility.

A targeted NOVX region is typically a sequence of two adenines (AA) and two
thymidines (TT) divided by a spacer region of nineteen (N19) residues (e.g., AA(N19)TT).
A desirable spacer region has a G/C-content of approximately 30% to 70%, and more
preferably of about 50%. If the sequence AA(N19)TT is not present in the target sequence,
an alternative target region would be AA(N21). The sequence of the NOVX sense siRNA
corresponds to (N19)TT or N21, respectively. In the latter case, conversion of the 3' end of
the sense siRNA to TT can be performed if such a sequence does not naturally occur in the
NOVX polynucleotide. The rationale for this sequence conversion is to generate a
symmetric duplex with respect to the sequence composition of the sense and antisense 3'
overhangs. Symmetric 3' overhangs may help to ensure that the siRNPs are formed with
approximately equal ratios of sense and antisense target RNA-cleaving siRNPs. See, e.g.,
Elbashir, Lendeckel and Tuschl (2001). Genes & Dev. 15: 188-200, incorporated by
reference herein in its entirety. The modification of the overhang of the sense sequence of
the siRNA duplex is not expected to affect targeted mRNA recognition, as the antisense
siRNA strand guides target recognition.
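
By way of non-limiting illustration, candidate AA(N19)TT sites can be located in a cDNA sequence by a simple scan that applies the position and G/C-content guidelines described in this section. The sketch is an assumption-laden example: the function name find_targets and its parameters are hypothetical, the input is taken to be a DNA-alphabet cDNA string, and the default minimum offset of 50 nt from the start codon reflects the lower end of the range stated in the text.

```python
import re
from typing import List, Tuple


def find_targets(cdna: str, start_codon_index: int = 0, min_offset: int = 50,
                 gc_min: float = 0.30, gc_max: float = 0.70
                 ) -> List[Tuple[int, str, str]]:
    """Return (position, 23-nt site, sense strand as (N19)TT) for each hit."""
    seq = cdna.upper()
    hits = []
    # A lookahead is used so that overlapping AA(N19)TT sites are all found.
    for match in re.finditer(r"(?=(AA([ACGT]{19})TT))", seq):
        position = match.start()
        if position < start_codon_index + min_offset:
            continue  # too close to the start codon
        spacer = match.group(2)
        gc_fraction = (spacer.count("G") + spacer.count("C")) / 19
        if gc_min <= gc_fraction <= gc_max:
            hits.append((position, match.group(1), spacer + "TT"))
    return hits


# Toy sequence: start codon, filler, then one AA(N19)TT site at index 60.
example = "ATG" + "C" * 57 + "AA" + "GCATGCATGCATGCATGCA" + "TT" + "CCC"
candidates = find_targets(example)
```

Each candidate would still be subject to the homology search described above before use.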

Alternatively, if the NOVX target mRNA does not contain a suitable AA(N21)
sequence, one may search for the sequence NA(N21). Further, the sequence of the sense
strand and antisense strand may still be synthesized as 5' (N19)TT, as it is believed that the
sequence of the 3'-most nucleotide of the antisense siRNA does not contribute to
specificity. Unlike antisense or ribozyme technology, the secondary structure of the target
mRNA does not appear to have a strong effect on silencing. See, Harborth, et al. (2001) J.
Cell Science 114: 4557-4565, incorporated by reference in its entirety.

Transfection of NOVX siRNA duplexes can be achieved using standard nucleic
acid transfection methods, for example, OLIGOFECTAMINE Reagent (commercially
available from Invitrogen). An assay for NOVX gene silencing is generally performed
approximately 2 days after transfection. No NOVX gene silencing has been observed in
the absence of transfection reagent, allowing for a comparative analysis of the wild-type
and silenced NOVX phenotypes. In a specific embodiment, for one well of a 24-well plate,
approximately 0.84 µg of the siRNA duplex is generally sufficient. Cells are typically
seeded the previous day, and are transfected at about 50% confluence. The choice of cell
culture media and conditions is routine to those of skill in the art, and will vary with the
choice of cell type. The efficiency of transfection may depend on the cell type, but also on
the passage number and the confluency of the cells. The time and the manner of formation
of siRNA-liposome complexes (e.g., inversion versus vortexing) are also critical. Low
transfection efficiencies are the most frequent cause of unsuccessful NOVX silencing. The
efficiency of transfection needs to be carefully examined for each new cell line to be used.
Preferred cells are derived from a mammal, more preferably from a rodent such as a rat or
mouse, and most preferably from a human. Where used for therapeutic treatment, the cells
are preferentially autologous, although non-autologous cell sources are also contemplated
as within the scope of the present invention.

For a control experiment, transfection of 0.84 µg single-stranded sense NOVX
siRNA will have no effect on NOVX silencing, and 0.84 µg antisense siRNA has a weak
silencing effect when compared to 0.84 µg of duplex siRNAs. Control experiments again
allow for a comparative analysis of the wild-type and silenced NOVX phenotypes. To
control for transfection efficiency, targeting of common proteins is typically performed, for
example targeting of lamin A/C or transfection of a CMV-driven EGFP-expression plasmid
(e.g., commercially available from Clontech). In the above example, the
fraction of lamin A/C knockdown in cells is determined the next day by such techniques as
immunofluorescence, Western blot, Northern blot or other similar assays for protein
expression or gene expression. Lamin A/C monoclonal antibodies may be obtained from
Santa Cruz Biotechnology.

Depending on the abundance and the half life (or turnover) of the targeted NOVX
polynucleotide in a cell, a knock-down phenotype may become apparent after 1 to 3 days,
or even later. In cases where no NOVX knock-down phenotype is observed, depletion of
the NOVX polynucleotide may be observed by immunofluorescence or Western blotting.
If the NOVX polynucleotide is still abundant after 3 days, cells need to be split and
transferred to a fresh 24-well plate for re-transfection. If no knock-down of the targeted
protein is observed, it may be desirable to analyze whether the target mRNA (NOVX or a
NOVX upstream or downstream gene) was effectively destroyed by the transfected siRNA
duplex. Two days after transfection, total RNA is prepared, reverse transcribed using a
target-specific primer, and PCR-amplified with a primer pair covering at least one
exon-exon junction in order to control for amplification of pre-mRNAs. RT/PCR of a
non-targeted mRNA is also needed as control. Effective depletion of the mRNA yet
undetectable reduction of target protein may indicate that a large reservoir of stable NOVX
protein may exist in the cell. Multiple transfection in sufficiently long intervals may be
necessary until the target protein is finally depleted to a point where a phenotype may
become apparent. If multiple transfection steps are required, cells are split 2 to 3 days after
transfection. The cells may be transfected immediately after splitting.

A therapeutic method of the invention contemplates administering a
NOVX siRNA construct as therapy to compensate for increased or aberrant NOVX
expression or activity. The NOVX ribopolynucleotide is obtained and processed into 
siRNA fragments, or a NOVX siRNA is synthesized, as described above. The NOVX 
siRNA is administered to cells or tissues using known nucleic acid transfection techniques, 
as described above. A NOVX siRNA specific for a NOVX gene will decrease or
knock down NOVX transcription products, which will lead to reduced NOVX polypeptide
production, resulting in reduced NOVX polypeptide activity in the cells or tissues. 

The present invention also encompasses a method of treating a disease or condition 
associated with the presence of a NOVX protein in an individual comprising administering 
to the individual an RNAi construct that targets the mRNA of the protein (the mRNA that
encodes the protein) for degradation. A specific RNAi construct includes a siRNA or a
double stranded gene transcript that is processed into siRNAs. Upon treatment, the target 
protein is not produced or is not produced to the extent it would be in the absence of the 
treatment. 

Where the NOVX gene function is not correlated with a known phenotype, a
control sample of cells or tissues from healthy individuals provides a reference standard for
determining NOVX expression levels. Expression levels are detected using the assays
described, e.g., RT-PCR, Northern blotting, Western blotting, ELISA, and the like. A
subject sample of cells or tissues is taken from a mammal, preferably a human subject,
suffering from a disease state. The NOVX ribopolynucleotide is used to produce siRNA
constructs that are specific for the NOVX gene product. These cells or tissues are treated
by administering NOVX siRNAs to the cells or tissues by methods described for the
transfection of nucleic acids into a cell or tissue, and a change in NOVX polypeptide or
polynucleotide expression is observed in the subject sample relative to the control sample,
using the assays described. This NOVX gene knockdown approach provides a rapid
method for determination of a NOVX minus (NOVX-) phenotype in the treated subject
sample. The NOVX- phenotype observed in the treated subject sample thus serves as a
marker for monitoring the course of a disease state during treatment.

In specific embodiments, a NOVX siRNA is used in therapy. Methods for the
generation and use of a NOVX siRNA are known to those skilled in the art. Example
techniques are provided below.

Production of RNAs 

Sense RNA (ssRNA) and antisense RNA (asRNA) of NOVX are produced using
known methods such as transcription in RNA expression vectors. In the initial
experiments, the sense and antisense RNA are about 500 bases in length each. The
produced ssRNA and asRNA (0.5 µM) in 10 mM Tris-HCl (pH 7.5) with 20 mM NaCl
are heated to 95° C for 1 min then cooled and annealed at room temperature for 12 to 16
h. The RNAs are precipitated and resuspended in lysis buffer (below). To monitor
annealing, RNAs are electrophoresed in a 2% agarose gel in TBE buffer and stained with
ethidium bromide. See, e.g., Sambrook et al., Molecular Cloning. Cold Spring Harbor
Laboratory Press, Plainview, N.Y. (1989).

Lysate Preparation 

Untreated rabbit reticulocyte lysate (Ambion) is assembled according to the
manufacturer's directions. dsRNA is incubated in the lysate at 30° C for 10 min prior to the
addition of mRNAs. Then NOVX mRNAs are added and the incubation continued for an 
additional 60 min. The molar ratio of double stranded RNA and mRNA is about 200:1. 
The NOVX mRNA is radiolabeled (using known techniques) and its stability is monitored 
by gel electrophoresis. 

In a parallel experiment made with the same conditions, the double stranded RNA is
internally radiolabeled with α-32P-ATP. Reactions are stopped by the addition of 2X
proteinase K buffer and deproteinized as described previously (Tuschl et al., Genes Dev.,
13:3191-3197 (1999)). Products are analyzed by electrophoresis in 15% or 18%
polyacrylamide sequencing gels using appropriate RNA standards. By monitoring the gels
for radioactivity, the natural production of 10 to 25 nt RNAs from the double stranded
RNA can be determined.

The band of double stranded RNA, about 21-23 bps, is eluted. The efficacy of
these 21-23 mers for suppressing NOVX transcription is assayed in vitro using the same
rabbit reticulocyte assay described above using 50 nanomolar of double stranded 21-23 mer
for each assay. The sequence of these 21-23 mers is then determined using standard
nucleic acid sequencing techniques.

RNA Preparation 

21 nt RNAs, based on the sequence determined above, are chemically synthesized 
using Expedite RNA phosphoramidites and thymidine phosphoramidite (Proligo, 
Germany). Synthetic oligonucleotides are deprotected and gel-purified (Elbashir,
Lendeckel, & Tuschl, Genes & Dev. 15, 188-200 (2001)), followed by Sep-Pak C18
cartridge (Waters, Milford, Mass., USA) purification (Tuschl, et al., Biochemistry,
32:11658-11668 (1993)).

These single-stranded RNAs (20 µM) are incubated in annealing buffer (100 mM
potassium acetate, 30 mM HEPES-KOH at pH 7.4, 2 mM magnesium acetate) for 1 min at
90° C followed by 1 h at 37° C.

Cell Culture 

A cell culture known in the art to regularly express NOVX is propagated using
standard conditions. 24 hours before transfection, at approx. 80% confluency, the cells are
trypsinized and diluted 1:5 with fresh medium without antibiotics (1-3 x 10^5 cells/ml) and
transferred to 24-well plates (500 µl/well). Transfection is performed using a
commercially available lipofection kit and NOVX expression is monitored using standard
techniques with positive and negative controls. A positive control is cells that naturally
express NOVX while a negative control is cells that do not express NOVX. Base-paired 21
and 22 nt siRNAs with overhanging 3' ends mediate efficient sequence-specific mRNA
degradation in lysates and in cell culture. Different concentrations of siRNAs are used. An
efficient concentration for suppression in vitro in mammalian culture is between 25 nM and
100 nM final concentration. This indicates that siRNAs are effective at concentrations that
are several orders of magnitude below the concentrations applied in conventional antisense
or ribozyme gene targeting experiments.

The above method provides a way both for the deduction of NOVX siRNA
sequence and the use of such siRNA for in vitro suppression. In vivo suppression may be
performed using the same siRNA using well known in vivo transfection or gene therapy
transfection techniques.

Antisense Nucleic Acids 

Another aspect of the invention pertains to isolated antisense nucleic acid molecules
that are hybridizable to or complementary to the nucleic acid molecule comprising the
nucleotide sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 124, or
fragments, analogs or derivatives thereof. An "antisense" nucleic acid comprises a
nucleotide sequence that is complementary to a "sense" nucleic acid encoding a protein
(e.g., complementary to the coding strand of a double-stranded cDNA molecule or
complementary to an mRNA sequence). In specific aspects, antisense nucleic acid
molecules are provided that comprise a sequence complementary to at least about 10, 25,
50, 100, 250 or 500 nucleotides or an entire NOVX coding strand, or to only a portion
thereof. Nucleic acid molecules encoding fragments, homologs, derivatives and analogs of
a NOVX protein of SEQ ID NO:2n, wherein n is an integer between 1 and 124, or
antisense nucleic acids complementary to a NOVX nucleic acid sequence of SEQ ID
NO:2n-1, wherein n is an integer between 1 and 124, are additionally provided.

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding
region" of the coding strand of a nucleotide sequence encoding a NOVX protein. The term
"coding region" refers to the region of the nucleotide sequence comprising codons which
are translated into amino acid residues. In another embodiment, the antisense nucleic acid
molecule is antisense to a "noncoding region" of the coding strand of a nucleotide sequence
encoding the NOVX protein. The term "noncoding region" refers to 5' and 3' sequences
which flank the coding region that are not translated into amino acids (i.e., also referred to
as 5' and 3' untranslated regions).

Given the coding strand sequences encoding the NOVX protein disclosed herein,
antisense nucleic acids of the invention can be designed according to the rules of Watson
and Crick or Hoogsteen base pairing. The antisense nucleic acid molecule can be
complementary to the entire coding region of NOVX mRNA, but more preferably is an
oligonucleotide that is antisense to only a portion of the coding or noncoding region of
NOVX mRNA. For example, the antisense oligonucleotide can be complementary to the
region surrounding the translation start site of NOVX mRNA. An antisense
oligonucleotide can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50
nucleotides in length. An antisense nucleic acid of the invention can be constructed using
chemical synthesis or enzymatic ligation reactions using procedures known in the art. For
example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically
synthesized using naturally-occurring nucleotides or variously modified nucleotides
designed to increase the biological stability of the molecules or to increase the physical
stability of the duplex formed between the antisense and sense nucleic acids (e.g.,
phosphorothioate derivatives and acridine substituted nucleotides can be used).
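
By way of non-limiting illustration, an antisense oligonucleotide complementary to the region surrounding the translation start site can be derived by Watson and Crick pairing as the reverse complement of a window of the coding strand. The sketch below is an example under stated assumptions: the function name antisense_oligo, the window sizes and the toy sequence are hypothetical, and the output uses unmodified DNA letters only.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(coding_strand: str, start_index: int,
                    upstream: int = 10, downstream: int = 10) -> str:
    """Reverse complement of the window around the translation start site.

    `start_index` is the 0-based position of the A of the start codon;
    the window is clipped to the ends of the sequence.
    """
    begin = max(0, start_index - upstream)
    end = min(len(coding_strand), start_index + downstream)
    window = coding_strand[begin:end].upper()
    # Complement each base, then reverse so the oligo reads 5' to 3'.
    return window.translate(DNA_COMPLEMENT)[::-1]


toy_cdna = "GGGCCCAAATTTATGGCCAAGCTT"  # start codon ATG at index 12
oligo = antisense_oligo(toy_cdna, start_index=12)
```

With the default window the oligonucleotide is 20 nucleotides long, one of the example lengths recited above.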

Examples of modified nucleotides that can be used to generate the antisense nucleic
acid include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine,
xanthine, 4-acetylcytosine, 5-carboxymethylaminomethyl-2-thiouridine,
5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyluracil, dihydrouracil,
beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine,
1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine,
5-methoxyuracil, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine,
5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, 2-thiouracil,
4-thiouracil, beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine,
pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 5-methyluracil,
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively,
the antisense nucleic acid can be produced biologically using an expression vector into
which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed
from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of
interest, described further in the following subsection).

The antisense nucleic acid molecules of the invention are typically administered to a
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or
genomic DNA encoding a NOVX protein to thereby inhibit expression of the protein (e.g.,
by inhibiting transcription and/or translation). The hybridization can be by conventional
nucleotide complementarity to form a stable duplex, or, for example, in the case of an
antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions
in the major groove of the double helix. An example of a route of administration of
antisense nucleic acid molecules of the invention includes direct injection at a tissue site.
Alternatively, antisense nucleic acid molecules can be modified to target selected cells and
then administered systemically. For example, for systemic administration, antisense
molecules can be modified such that they specifically bind to receptors or antigens
expressed on a selected cell surface (e.g., by linking the antisense nucleic acid molecules to
peptides or antibodies that bind to cell surface receptors or antigens). The antisense nucleic
acid molecules can also be delivered to cells using the vectors described herein. To achieve
sufficient nucleic acid molecules, vector constructs in which the antisense nucleic acid
molecule is placed under the control of a strong pol II or pol III promoter are preferred.

In yet another embodiment, the antisense nucleic acid molecule of the invention is
an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units,
the strands run parallel to each other. See, e.g., Gaultier, et al., 1987. Nucl. Acids Res. 15:
6625-6641. The antisense nucleic acid molecule can also comprise a
2'-o-methylribonucleotide (See, e.g., Inoue, et al. 1987. Nucl. Acids Res. 15: 6131-6148) or
a chimeric RNA-DNA analogue (See, e.g., Inoue, et al., 1987. FEBS Lett. 215: 327-330).

Ribozymes and PNA Moieties 

Nucleic acid modifications include, by way of non-limiting example, modified
bases, and nucleic acids whose sugar phosphate backbones are modified or derivatized.
These modifications are carried out at least in part to enhance the chemical stability of the 
modified nucleic acid, such that they may be used, for example, as antisense binding 
nucleic acids in therapeutic applications in a subject. 

In one embodiment, an antisense nucleic acid of the invention is a ribozyme.
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of
cleaving a single-stranded nucleic acid, such as an mRNA, to which they have a 
complementary region. Thus, ribozymes (e.g., hammerhead ribozymes as described in 
Haseloff and Gerlach 1988. Nature 334: 585-591) can be used to catalytically cleave
NOVX mRNA transcripts to thereby inhibit translation of NOVX mRNA. A ribozyme
having specificity for a NOVX-encoding nucleic acid can be designed based upon the
nucleotide sequence of a NOVX cDNA disclosed herein (i.e., SEQ ID NO:2n-1, wherein n
is an integer between 1 and 124). For example, a derivative of a Tetrahymena L-19 IVS
RNA can be constructed in which the nucleotide sequence of the active site is
complementary to the nucleotide sequence to be cleaved in a NOVX-encoding mRNA.
See, e.g., U.S. Patent 4,987,071 to Cech, et al. and U.S. Patent 5,116,742 to Cech, et al.
NOVX mRNA can also be used to select a catalytic RNA having a specific ribonuclease
activity from a pool of RNA molecules. See, e.g., Bartel et al., (1993) Science
261:1411-1418.

Alternatively, NOVX gene expression can be inhibited by targeting nucleotide 
sequences complementary to the regulatory region of the NOVX nucleic acid (e.g., the 
NOVX promoter and/or enhancers) to form triple helical structures that prevent
transcription of the NOVX gene in target cells. See, e.g., Helene, 1991. Anticancer Drug
Des. 6: 569-84; Helene, et al., 1992. Ann. N.Y. Acad. Sci. 660: 27-36; Maher, 1992.
Bioessays 14: 807-15.

In various embodiments, the NOVX nucleic acids can be modified at the base
moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization,
or solubility of the molecule. For example, the deoxyribose phosphate backbone of the
nucleic acids can be modified to generate peptide nucleic acids. See, e.g., Hyrup, et al.,
1996. Bioorg Med Chem 4: 5-23. As used herein, the terms "peptide nucleic acids" or
"PNAs" refer to nucleic acid mimics (e.g., DNA mimics) in which the deoxyribose
phosphate backbone is replaced by a pseudopeptide backbone and only the four natural
nucleotide bases are retained. The neutral backbone of PNAs has been shown to allow for
specific hybridization to DNA and RNA under conditions of low ionic strength. The
synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis
protocols as described in Hyrup, et al., 1996. supra; Perry-O'Keefe, et al., 1996. Proc. Natl.
Acad. Sci. USA 93: 14670-14675.

PNAs of NOVX can be used in therapeutic and diagnostic applications. For 
example, PNAs can be used as antisense or antigene agents for sequence-specific 
modulation of gene expression by, e.g., inducing transcription or translation arrest or
inhibiting replication. PNAs of NOVX can also be used, for example, in the analysis of
single base pair mutations in a gene (e.g., PNA directed PCR clamping); as artificial
restriction enzymes when used in combination with other enzymes, e.g., S1 nucleases (See,
Hyrup, et al., 1996. supra); or as probes or primers for DNA sequence and hybridization
(See, Hyrup, et al., 1996. supra; Perry-O'Keefe, et al., 1996. supra).

In another embodiment, PNAs of NOVX can be modified, e.g., to enhance their
stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug 
delivery known in the art. For example, PNA-DNA chimeras of NOVX can be generated 
that may combine the advantageous properties of PNA and DNA. Such chimeras allow
DNA recognition enzymes (e.g., RNase H and DNA polymerases) to interact with the DNA
portion while the PNA portion would provide high binding affinity and specificity. 
PNA-DNA chimeras can be linked using linkers of appropriate lengths selected in terms of 
base stacking, number of bonds between the nucleotide bases, and orientation (see, Hyrup,
et al., 1996. supra). The synthesis of PNA-DNA chimeras can be performed as described
in Hyrup, et al., 1996. supra and Finn, et al., 1996. Nucl. Acids Res. 24: 3357-3363. For
example, a DNA chain can be synthesized on a solid support using standard
phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g.,
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the
PNA and the 5' end of DNA. See, e.g., Mag, et al., 1989. Nucl. Acids Res. 17: 5973-5988.
PNA monomers are then coupled in a stepwise manner to produce a chimeric molecule
with a 5' PNA segment and a 3' DNA segment. See, e.g., Finn, et al., 1996. supra.
Alternatively, chimeric molecules can be synthesized with a 5' DNA segment and a 3' PNA
segment. See, e.g., Petersen, et al., 1995. Bioorg. Med. Chem. Lett. 5: 1119-1124.

In other embodiments, the oligonucleotide may include other appended groups such
as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport
across the cell membrane (see, e.g., Letsinger, et al., 1989. Proc. Natl. Acad. Sci. U.S.A. 86:
6553-6556; Lemaitre, et al., 1987. Proc. Natl. Acad. Sci. 84: 648-652; PCT Publication No.
WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO 89/10134). In
addition, oligonucleotides can be modified with hybridization triggered cleavage agents
(see, e.g., Krol, et al., 1988. BioTechniques 6:958-976) or intercalating agents (see, e.g.,
Zon, 1988. Pharm. Res. 5: 539-549). To this end, the oligonucleotide may be conjugated to
another molecule, e.g., a peptide, a hybridization triggered cross-linking agent, a transport
agent, a hybridization-triggered cleavage agent, and the like.

NOVX Polypeptides

A polypeptide according to the invention includes a polypeptide including the 
amino acid sequence of NOVX polypeptides whose sequences are provided in any one of 
SEQ ID NO:2n, wherein n is an integer between 1 and 124. The invention also includes a 
mutant or variant protein any of whose residues may be changed from the corresponding 
residues shown in any one of SEQ ID NO:2n, wherein n is an integer between 1 and 124,
while still encoding a protein that maintains its NOVX activities and physiological 
functions, or a functional fragment thereof. 

In general, a NOVX variant that preserves NOVX-like function includes any variant
in which residues at a particular position in the sequence have been substituted by other
amino acids, and further includes the possibility of inserting an additional residue or
residues between two residues of the parent protein as well as the possibility of deleting
one or more residues from the parent sequence. Any amino acid substitution, insertion, or
deletion is encompassed by the invention. In favorable circumstances, the substitution is a 
conservative substitution as defined above. 

One aspect of the invention pertains to isolated NOVX proteins, and 
biologically-active portions thereof, or derivatives, fragments, analogs or homologs thereof. 
Also provided are polypeptide fragments suitable for use as immunogens to raise
anti-NOVX antibodies. In one embodiment, native NOVX proteins can be isolated from
cells or tissue sources by an appropriate purification scheme using standard protein 
purification techniques. In another embodiment, NOVX proteins are produced by 
recombinant DNA techniques. Alternative to recombinant expression, a NOVX protein or 
polypeptide can be synthesized chemically using standard peptide synthesis techniques.

An "isolated" or "purified" polypeptide or protein or biologically-active portion 
thereof is substantially free of cellular material or other contaminating proteins from the 
cell or tissue source from which the NOVX protein is derived, or substantially free from 
chemical precursors or other chemicals when chemically synthesized. The language
"substantially free of cellular material" includes preparations of NOVX proteins in which
the protein is separated from cellular components of the cells from which it is isolated or
recombinantly-produced. In one embodiment, the language "substantially free of cellular
material" includes preparations of NOVX proteins having less than about 30% (by dry
weight) of non-NOVX proteins (also referred to herein as a "contaminating protein"), more
preferably less than about 20% of non-NOVX proteins, still more preferably less than about
10% of non-NOVX proteins, and most preferably less than about 5% of non-NOVX
proteins. When the NOVX protein or biologically-active portion thereof is
recombinantly-produced, it is also preferably substantially free of culture medium, i.e.,
culture medium represents less than about 20%, more preferably less than about 10%, and
most preferably less than about 5% of the volume of the NOVX protein preparation.

The language "substantially free of chemical precursors or other chemicals" 
includes preparations of NOVX proteins in which the protein is separated from chemical 
precursors or other chemicals that are involved in the synthesis of the protein. In one 
embodiment, the language "substantially free of chemical precursors or other chemicals"
includes preparations of NOVX proteins having less than about 30% (by dry weight) of
chemical precursors or non-NOVX chemicals, more preferably less than about 20%
chemical precursors or non-NOVX chemicals, still more preferably less than about 10%
chemical precursors or non-NOVX chemicals, and most preferably less than about 5%
chemical precursors or non-NOVX chemicals.

Biologically-active portions of NOVX proteins include peptides comprising amino 
acid sequences sufficiently homologous to or derived from the amino acid sequences of the 
NOVX proteins (e.g., the amino acid sequence of SEQ ID NO:2n, wherein n is an integer
between 1 and 124) that include fewer amino acids than the full-length NOVX proteins,
and exhibit at least one activity of a NOVX protein. Typically, biologically-active portions 
comprise a domain or motif with at least one activity of the NOVX protein. A 
biologically-active portion of a NOVX protein can be a polypeptide which is, for example, 
10, 25, 50, 100 or more amino acid residues in length. 

Moreover, other biologically-active portions, in which other regions of the protein
are deleted, can be prepared by recombinant techniques and evaluated for one or more of
the functional activities of a native NOVX protein. 

In an embodiment, the NOVX protein has an amino acid sequence of SEQ ID
NO:2n, wherein n is an integer between 1 and 124. In other embodiments, the NOVX
protein is substantially homologous to SEQ ID NO:2n, wherein n is an integer between 1
and 124, and retains the functional activity of the protein of SEQ ID NO:2n, wherein n is
an integer between 1 and 124, yet differs in amino acid sequence due to natural allelic
variation or mutagenesis, as described in detail, below. Accordingly, in another
embodiment, the NOVX protein is a protein that comprises an amino acid sequence at least
about 45% homologous to the amino acid sequence of SEQ ID NO:2n, wherein n is an
integer between 1 and 124, and retains the functional activity of the NOVX proteins of
SEQ ID NO:2n, wherein n is an integer between 1 and 124.

Determining Homology Between Two or More Sequences 

To determine the percent homology of two amino acid sequences or of two nucleic 
acids, the sequences are aligned for optimal comparison purposes (e.g., gaps can be
introduced in the sequence of a first amino acid or nucleic acid sequence for optimal 
alignment with a second amino or nucleic acid sequence). The amino acid residues or 
nucleotides at corresponding amino acid positions or nucleotide positions are then 
compared. When a position in the first sequence is occupied by the same amino acid
residue or nucleotide as the corresponding position in the second sequence, then the 
molecules are homologous at that position (i.e., as used herein amino acid or nucleic acid 
"homology" is equivalent to amino acid or nucleic acid "identity"). 

The nucleic acid sequence homology may be determined as the degree of identity
between two sequences. The homology may be determined using computer programs
known in the art, such as GAP software provided in the GCG program package. See,
Needleman and Wunsch, 1970. J. Mol. Biol. 48: 443-453. Using GCG GAP software with
the following settings for nucleic acid sequence comparison: GAP creation penalty of 5.0
and GAP extension penalty of 0.3, the coding region of the analogous nucleic acid
sequences referred to above exhibits a degree of identity preferably of at least 70%, 75%,
80%, 85%, 90%, 95%, 98%, or 99%, with the CDS (encoding) part of the DNA sequence
of SEQ ID NO:2n-1, wherein n is an integer between 1 and 124.

The term "sequence identity" refers to the degree to which two polynucleotide or
polypeptide sequences are identical on a residue-by-residue basis over a particular region of
comparison. The term "percentage of sequence identity" is calculated by comparing two
optimally aligned sequences over that region of comparison, determining the number of
positions at which the identical nucleic acid base (e.g., A, T, C, G, U, or I, in the case of
nucleic acids) occurs in both sequences to yield the number of matched positions, dividing
the number of matched positions by the total number of positions in the region of
comparison (i.e., the window size), and multiplying the result by 100 to yield the
percentage of sequence identity. The term "substantial identity" as used herein denotes a
characteristic of a polynucleotide sequence, wherein the polynucleotide comprises a
sequence that has at least 80 percent sequence identity, preferably at least 85 percent
identity and often 90 to 95 percent sequence identity, more usually at least 99 percent
sequence identity as compared to a reference sequence over a comparison region.
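
The calculation of "percentage of sequence identity" described above can be illustrated with a minimal sketch. It assumes the two sequences have already been optimally aligned (for example, by a GAP-style program) and are supplied as equal-length strings in which "-" marks a gap; the function name percent_identity and the gap convention are illustrative assumptions, not part of the disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identical positions over an aligned comparison window.

    Both inputs are assumed to be pre-aligned, equal-length strings;
    '-' denotes a gap (an assumed convention for this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    window_size = len(aligned_a)
    if window_size == 0:
        raise ValueError("comparison window is empty")

    # Count positions where the same residue occurs in both sequences.
    # A gap aligned to a gap is not counted as a match.
    matched = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )

    # Matched positions divided by the window size, multiplied by 100.
    return 100.0 * matched / window_size


if __name__ == "__main__":
    print(percent_identity("ACGTAC-GT", "ACGAACTGT"))
```

In this sketch the denominator is the full window size, including gapped positions, following the definition in the preceding paragraph; other conventions for handling gaps exist and would give different values.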

Chimeric and Fusion Proteins 

The invention also provides NOVX chimeric or fusion proteins. As used herein, a 
NOVX "chimeric protein" or "fusion protein" comprises a NOVX polypeptide 
operatively-linked to a non-NOVX polypeptide. A "NOVX polypeptide" refers to a
polypeptide having an amino acid sequence corresponding to a NOVX protein of SEQ ID
NO:2n, wherein n is an integer between 1 and 124, whereas a "non-NOVX polypeptide"
refers to a polypeptide having an amino acid sequence corresponding to a protein that is not
substantially homologous to the NOVX protein, e.g., a protein that is different from the
NOVX protein and that is derived from the same or a different organism. Within a NOVX 
fusion protein the NOVX polypeptide can correspond to all or a portion of a NOVX 
protein. In one embodiment, a NOVX fusion protein comprises at least one 
biologically-active portion of a NOVX protein. In another embodiment, a NOVX fusion
protein comprises at least two biologically-active portions of a NOVX protein. In yet
another embodiment, a NOVX fusion protein comprises at least three biologically-active
portions of a NOVX protein. Within the fusion protein, the term "operatively-linked" is
intended to indicate that the NOVX polypeptide and the non-NOVX polypeptide are fused
in-frame with one another. The non-NOVX polypeptide can be fused to the N-terminus or
C-terminus of the NOVX polypeptide. 

In one embodiment, the fusion protein is a GST-NOVX fusion protein in which the
NOVX sequences are fused to the C-terminus of the GST (glutathione S-transferase)
sequences. Such fusion proteins can facilitate the purification of recombinant NOVX
polypeptides.

In another embodiment, the fusion protein is a NOVX protein containing a 
heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian host 
cells), expression and/or secretion of NOVX can be increased through use of a 
heterologous signal sequence. 

In yet another embodiment, the fusion protein is a NOVX-immunoglobulin fusion
protein in which the NOVX sequences are fused to sequences derived from a member of
the immunoglobulin protein family. The NOVX-immunoglobulin fusion proteins of the
invention can be incorporated into pharmaceutical compositions and administered to a
subject to inhibit an interaction between a NOVX ligand and a NOVX protein on the
surface of a cell, to thereby suppress NOVX-mediated signal transduction in vivo. The
NOVX-immunoglobulin fusion proteins can be used to affect the bioavailability of a
NOVX cognate ligand. Inhibition of the NOVX ligand/NOVX interaction may be useful
therapeutically both for the treatment of proliferative and differentiative disorders and
for modulating (e.g., promoting or inhibiting) cell survival. Moreover, the
NOVX-immunoglobulin fusion proteins of the invention can be used as immunogens to
produce anti-NOVX antibodies in a subject, to purify NOVX ligands, and in screening 
assays to identify molecules that inhibit the interaction of NOVX with a NOVX ligand. 

A NOVX chimeric or fusion protein of the invention can be produced by standard
recombinant DNA techniques. For example, DNA fragments coding for the different
polypeptide sequences are ligated together in-frame in accordance with conventional
techniques, e.g., by employing blunt-ended or stagger-ended termini for ligation, restriction
enzyme digestion to provide for appropriate termini, filling-in of cohesive ends as
appropriate, alkaline phosphatase treatment to avoid undesirable joining, and enzymatic
ligation. In another embodiment, the fusion gene can be synthesized by conventional
techniques including automated DNA synthesizers. Alternatively, PCR amplification of
gene fragments can be carried out using anchor primers that give rise to complementary
overhangs between two consecutive gene fragments that can subsequently be annealed and
reamplified to generate a chimeric gene sequence (see, e.g., Ausubel, et al. (eds.) Current
Protocols in Molecular Biology, John Wiley & Sons, 1992). Moreover, many
expression vectors are commercially available that already encode a fusion moiety (e.g., a
GST polypeptide). A NOVX-encoding nucleic acid can be cloned into such an expression
vector such that the fusion moiety is linked in-frame to the NOVX protein.

NOVX Agonists and Antagonists 

The invention also pertains to variants of the NOVX proteins that function as either 
NOVX agonists (i.e., mimetics) or as NOVX antagonists. Variants of the NOVX protein
can be generated by mutagenesis (e.g., discrete point mutation or truncation of the NOVX
protein). An agonist of the NOVX protein can retain substantially the same, or a subset of,
the biological activities of the naturally occurring form of the NOVX protein. An
antagonist of the NOVX protein can inhibit one or more of the activities of the naturally
occurring form of the NOVX protein by, for example, competitively binding to a
downstream or upstream member of a cellular signaling cascade which includes the NOVX
protein. Thus, specific biological effects can be elicited by treatment with a variant of
limited function. In one embodiment, treatment of a subject with a variant having a subset
of the biological activities of the naturally occurring form of the protein has fewer side 
effects in a subject relative to treatment with the naturally occurring form of the NOVX 
proteins. 

Variants of the NOVX proteins that function as either NOVX agonists (i.e.,
mimetics) or as NOVX antagonists can be identified by screening combinatorial libraries of
mutants (e.g., truncation mutants) of the NOVX proteins for NOVX protein agonist or
antagonist activity. In one embodiment, a variegated library of NOVX variants is
generated by combinatorial mutagenesis at the nucleic acid level and is encoded by a
variegated gene library. A variegated library of NOVX variants can be produced by, for
example, enzymatically ligating a mixture of synthetic oligonucleotides into gene
sequences such that a degenerate set of potential NOVX sequences is expressible as
individual polypeptides, or alternatively, as a set of larger fusion proteins (e.g., for phage
display) containing the set of NOVX sequences therein. There are a variety of methods
which can be used to produce libraries of potential NOVX variants from a degenerate
oligonucleotide sequence. Chemical synthesis of a degenerate gene sequence can be
performed in an automatic DNA synthesizer, and the synthetic gene then ligated into an
appropriate expression vector. Use of a degenerate set of genes allows for the provision, in
one mixture, of all of the sequences encoding the desired set of potential NOVX sequences.
Methods for synthesizing degenerate oligonucleotides are well-known within the art. See,
e.g., Narang, 1983. Tetrahedron 39: 3; Itakura, et al., 1984. Annu. Rev. Biochem. 53: 323;
Itakura, et al., 1984. Science 198: 1056; Ike, et al., 1983. Nucl. Acids Res. 11: 477.
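
To illustrate what a "degenerate set" of sequences means in practice, the following minimal sketch enumerates every concrete sequence represented by a degenerate oligonucleotide written with the standard IUPAC nucleotide ambiguity codes. The function name expand_degenerate and the example oligonucleotides are hypothetical and serve only as an illustration; they are not taken from the disclosure.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (DNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def expand_degenerate(oligo: str) -> list[str]:
    """Return every concrete sequence encoded by a degenerate oligonucleotide."""
    try:
        choices = [IUPAC[base] for base in oligo.upper()]
    except KeyError as err:
        raise ValueError(f"not an IUPAC nucleotide code: {err.args[0]!r}") from err
    # Cartesian product over the allowed bases at each position.
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    # A hypothetical NNK codon: any base, any base, then G or T.
    variants = expand_degenerate("NNK")
    print(len(variants), variants[:4])
```

The size of the library grows multiplicatively with each degenerate position, which is why a single synthesis can provide, in one mixture, all sequences of the desired set.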

Polypeptide Libraries

In addition, libraries of fragments of the NOVX protein coding sequences can be 
used to generate a variegated population of NOVX fragments for screening and subsequent 
selection of variants of a NOVX protein. In one embodiment, a library of coding sequence 
fragments can be generated by treating a double stranded PCR fragment of a NOVX coding
sequence with a nuclease under conditions wherein nicking occurs only about once per
molecule, denaturing the double stranded DNA, renaturing the DNA to form
double-stranded DNA that can include sense/antisense pairs from different nicked products,
removing single-stranded portions from reformed duplexes by treatment with S1 nuclease,
and ligating the resulting fragment library into an expression vector. By this method,
expression libraries can be derived which encode N-terminal and internal fragments of
various sizes of the NOVX proteins.

Various techniques are known in the art for screening gene products of 
combinatorial libraries made by point mutations or truncation, and for screening cDNA 
libraries for gene products having a selected property. Such techniques are adaptable for
rapid screening of the gene libraries generated by the combinatorial mutagenesis of NOVX
proteins. The most widely used techniques, which are amenable to high throughput
analysis, for screening large gene libraries typically include cloning the gene library into
replicable expression vectors, transforming appropriate cells with the resulting library of
vectors, and expressing the combinatorial genes under conditions in which detection of a
desired activity facilitates isolation of the vector encoding the gene whose product was
detected. Recursive ensemble mutagenesis (REM), a new technique that enhances the
frequency of functional mutants in the libraries, can be used in combination with the
screening assays to identify NOVX variants. See, e.g., Arkin and Youvan, 1992. Proc.
Natl. Acad. Sci. USA 89: 7811-7815; Delagrave, et al., 1993. Protein Engineering
6:327-331.

Anti-NOVX Antibodies 

Included in the invention are antibodies to NOVX proteins, or fragments of NOVX 
proteins. The term "antibody" as used herein refers to immunoglobulin molecules and
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that
contain an antigen binding site that specifically binds (immunoreacts with) an antigen.
Such antibodies include, but are not limited to, polyclonal, monoclonal, chimeric, single
chain, Fab, Fab' and F(ab')2 fragments, and an Fab expression library. In general, antibody
molecules obtained from humans relate to any of the classes IgG, IgM, IgA, IgE and IgD,
which differ from one another by the nature of the heavy chain present in the molecule.
Certain classes have subclasses as well, such as IgG1, IgG2, and others. Furthermore, in
humans, the light chain may be a kappa chain or a lambda chain. Reference herein to
antibodies includes a reference to all such classes, subclasses and types of human antibody
species.

An isolated protein of the invention intended to serve as an antigen, or a portion or 
fragment thereof, can be used as an immunogen to generate antibodies that 
immunospecifically bind the antigen, using standard techniques for polyclonal and 
monoclonal antibody preparation. The full-length protein can be used or, alternatively, the
invention provides antigenic peptide fragments of the antigen for use as immunogens. An
antigenic peptide fragment comprises at least 6 amino acid residues of the amino acid
sequence of the full length protein, such as an amino acid sequence of SEQ ID NO:2n,
wherein n is an integer between 1 and 124, and encompasses an epitope thereof such that
an antibody raised against the peptide forms a specific immune complex with the full
length protein or with any fragment that contains the epitope. Preferably, the antigenic
peptide comprises at least 10 amino acid residues, or at least 15 amino acid residues, or at
least 20 amino acid residues, or at least 30 amino acid residues. Preferred epitopes
encompassed by the antigenic peptide are regions of the protein that are located on its
surface; commonly these are hydrophilic regions. 

In certain embodiments of the invention, at least one epitope encompassed by the 
antigenic peptide is a region of NOVX that is located on the surface of the protein, e.g., a 
hydrophilic region. A hydrophobicity analysis of the human NOVX protein sequence will
indicate which regions of a NOVX polypeptide are particularly hydrophilic and, therefore, 
are likely to encode surface residues useful for targeting antibody production. As a means 
for targeting antibody production, hydropathy plots showing regions of hydrophilicity and 
hydrophobicity may be generated by any method well known in the art, including, for
example, the Kyte Doolittle or the Hopp Woods methods, either with or without Fourier
transformation. See, e.g., Hopp and Woods, 1981, Proc. Nat. Acad. Sci. USA 78:
3824-3828; Kyte and Doolittle 1982, J. Mol. Biol. 157: 105-142, each incorporated herein
by reference in their entirety. Antibodies that are specific for one or more domains within
an antigenic protein, or derivatives, fragments, analogs or homologs thereof, are also
provided herein.
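
As a minimal sketch of the kind of hydropathy analysis referred to above, the following code computes a sliding-window average of the published Kyte-Doolittle hydropathy values along a protein sequence and reports windows whose average falls below a cutoff, i.e., candidate hydrophilic regions. The window length of 9, the cutoff of 0.0, the function names and the toy sequence are assumptions chosen for illustration only; the disclosure does not prescribe particular parameters.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 9) -> list[float]:
    """Mean Kyte-Doolittle score for each window along the sequence."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    # A residue outside the 20 standard amino acids raises KeyError here.
    scores = [KYTE_DOOLITTLE[residue] for residue in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def hydrophilic_windows(sequence: str, window: int = 9,
                        threshold: float = 0.0) -> list[int]:
    """Start indices (0-based) of windows whose mean score is below threshold."""
    profile = hydropathy_profile(sequence, window)
    return [i for i, score in enumerate(profile) if score < threshold]


if __name__ == "__main__":
    toy = "KKKKKIIIII"  # hypothetical sequence: hydrophilic then hydrophobic
    print(hydropathy_profile(toy, window=5))
    print(hydrophilic_windows(toy, window=5))
```

Negative window averages indicate hydrophilic stretches, which are the regions the preceding paragraphs describe as likely to be surface-exposed; this sketch omits the optional Fourier transformation.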

The term "epitope" includes any protein determinant capable of specific binding to 
an immunoglobulin or T-cell receptor. Epitopic determinants usually consist of chemically 
active surface groupings of molecules such as amino acids or sugar side chains and usually 
have specific three dimensional structural characteristics, as well as specific charge
characteristics. A NOVX polypeptide or a fragment thereof comprises at least one antigenic
epitope. An anti-NOVX antibody of the present invention is said to specifically bind to
antigen NOVX when the equilibrium binding constant (KD) is <1 µM, preferably < 100
nM, more preferably < 10 nM, and most preferably < 100 pM to about 1 pM, as measured
by assays such as radioligand binding assays or similar assays known to those skilled in the
art.
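
The binding-constant thresholds recited above can be expressed as a simple classification, sketched below. The function name binding_tier and the tier labels are hypothetical, chosen only to mirror the wording "preferably", "more preferably" and "most preferably"; the input is assumed to be a measured KD in molar units.

```python
def binding_tier(kd_molar: float) -> str:
    """Map an equilibrium binding constant (in mol/L) to the recited tiers."""
    if kd_molar <= 0:
        raise ValueError("KD must be a positive concentration")
    if kd_molar < 100e-12:   # below 100 pM
        return "most preferred"
    if kd_molar < 10e-9:     # below 10 nM
        return "more preferred"
    if kd_molar < 100e-9:    # below 100 nM
        return "preferred"
    if kd_molar < 1e-6:      # below 1 uM
        return "specific"
    return "not specific"


if __name__ == "__main__":
    for kd in (5e-11, 5e-9, 5e-8, 5e-7, 2e-6):
        print(kd, binding_tier(kd))
```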

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog 
thereof, may be utilized as an immunogen in the generation of antibodies that 
immunospecifically bind these protein components. 

Various procedures known within the art may be used for the production of 
polyclonal or monoclonal antibodies directed against a protein of the invention, or against
derivatives, fragments, analogs, homologs or orthologs thereof (see, for example,
Antibodies: A Laboratory Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, NY, incorporated herein by reference). Some of
these antibodies are discussed below. 

Polyclonal Antibodies 

For the production of polyclonal antibodies, various suitable host animals (e.g.,
rabbit, goat, mouse or other mammal) may be immunized by one or more injections with
the native protein, a synthetic variant thereof, or a derivative of the foregoing. An
appropriate immunogenic preparation can contain, for example, the naturally occurring
immunogenic protein, a chemically synthesized polypeptide representing the immunogenic
protein, or a recombinantly expressed immunogenic protein. Furthermore, the protein may
be conjugated to a second protein known to be immunogenic in the mammal being
immunized. Examples of such immunogenic proteins include but are not limited to
keyhole limpet hemocyanin, serum albumin, bovine thyroglobulin, and soybean trypsin
inhibitor. The preparation can further include an adjuvant. Various adjuvants used to
increase the immunological response include, but are not limited to, Freund's (complete and
incomplete), mineral gels (e.g., aluminum hydroxide), surface active substances (e.g.,
lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, dinitrophenol, etc.),
adjuvants usable in humans such as Bacille Calmette-Guerin and Corynebacterium parvum,
or similar immunostimulatory agents. Additional examples of adjuvants which can be
employed include MPL-TDM adjuvant (monophosphoryl Lipid A, synthetic trehalose
dicorynomycolate).

The polyclonal antibody molecules directed against the immunogenic protein can be 
isolated from the mammal (e.g., from the blood) and further purified by well known 
techniques, such as affinity chromatography using protein A or protein G, which provide 
primarily the IgG fraction of immune serum. Subsequently, or alternatively, the specific
antigen which is the target of the immunoglobulin sought, or an epitope thereof, may be
immobilized on a column to purify the immune specific antibody by immunoaffinity 
chromatography. Purification of immunoglobulins is discussed, for example, by D. 
Wilkinson (The Scientist, published by The Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 
(April 17, 2000), pp. 25-28). 

Monoclonal Antibodies

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as 
used herein, refers to a population of antibody molecules that contain only one molecular 
species of antibody molecule consisting of a unique light chain gene product and a unique
heavy chain gene product. In particular, the complementarity determining regions (CDRs)
of the monoclonal antibody are identical in all the molecules of the population. MAbs thus
contain an antigen binding site capable of immunoreacting with a particular epitope of the
antigen characterized by a unique binding affinity for it.

Monoclonal antibodies can be prepared using hybridoma methods, such as those 
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a 
mouse, hamster, or other appropriate host animal, is typically immunized with an 
immunizing agent to elicit lymphocytes that produce or are capable of producing antibodies
that will specifically bind to the immunizing agent. Alternatively, the lymphocytes can be
immunized in vitro. 

The immunizing agent will typically include the protein antigen, a fragment thereof 
or a fusion protein thereof. Generally, either peripheral blood lymphocytes are used if cells 
of human origin are desired, or spleen cells or lymph node cells are used if non-human
mammalian sources are desired. The lymphocytes are then fused with an immortalized cell
line using a suitable fusing agent, such as polyethylene glycol, to form a hybridoma cell
(Goding, Monoclonal Antibodies: Principles and Practice, Academic Press, (1986) pp.
59-103). Immortalized cell lines are usually transformed mammalian cells, particularly
myeloma cells of rodent, bovine and human origin. Usually, rat or mouse myeloma cell
lines are employed. The hybridoma cells can be cultured in a suitable culture medium that
preferably contains one or more substances that inhibit the growth or survival of the
unfused, immortalized cells. For example, if the parental cells lack the enzyme
hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium
for the hybridomas typically will include hypoxanthine, aminopterin, and thymidine ("HAT
medium"), which substances prevent the growth of HGPRT-deficient cells.
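
The selection logic of HAT medium can be sketched as a simple truth table. The following is an illustrative model only, not part of the disclosed method: the `Cell` class and `survives_hat` function are hypothetical names, and the model assumes (as general background, not stated above) that unfused primary lymphocytes have a limited lifespan in culture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    name: str
    has_hgprt: bool  # can salvage hypoxanthine via HGPRT
    immortal: bool   # can proliferate indefinitely in culture (assumed trait)


def survives_hat(cell: Cell) -> bool:
    # Aminopterin blocks de novo nucleotide synthesis, so growth in HAT
    # requires HGPRT; long-term outgrowth also requires immortality.
    return cell.has_hgprt and cell.immortal


population = [
    Cell("unfused myeloma", has_hgprt=False, immortal=True),
    Cell("unfused lymphocyte", has_hgprt=True, immortal=False),
    Cell("hybridoma", has_hgprt=True, immortal=True),
]

survivors = [c.name for c in population if survives_hat(c)]
print(survivors)
```

Under these assumptions only the fused hybridoma, which inherits HGPRT from the lymphocyte and immortality from the myeloma parent, remains in the culture.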

Preferred immortalized cell lines are those that fuse efficiently, support stable high 
level expression of antibody by the selected antibody-producing cells, and are sensitive to a 
medium such as HAT medium. More preferred immortalized cell lines are murine 
myeloma lines, which can be obtained, for instance, from the Salk Institute Cell
Distribution Center, San Diego, California and the American Type Culture Collection,
Manassas, Virginia. Human myeloma and mouse-human heteromyeloma cell lines also
have been described for the production of human monoclonal antibodies (Kozbor, J.
Immunol., 133:3001 (1984); Brodeur et al., Monoclonal Antibody Production Techniques
and Applications, Marcel Dekker, Inc., New York, (1987) pp. 51-63).

The culture medium in which the hybridoma cells are cultured can then be assayed 
for the presence of monoclonal antibodies directed against the antigen. Preferably, the
binding specificity of monoclonal antibodies produced by the hybridoma cells is
determined by immunoprecipitation or by an in vitro binding assay, such as
radioimmunoassay (RIA) or enzyme-linked immunosorbent assay (ELISA). Such
techniques and assays are known in the art. The binding affinity of the monoclonal
antibody can, for example, be determined by the Scatchard analysis of Munson and Rodbard,
Anal. Biochem., 107:220 (1980). It is an objective, especially important in therapeutic
applications of monoclonal antibodies, to identify antibodies having a high degree of
specificity and a high binding affinity for the target antigen.
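
As an illustration of how a binding affinity can be estimated from binding data, the following minimal sketch applies the classical linear Scatchard transform, in which bound/free is plotted against bound and the slope equals -1/Kd. It assumes a single class of non-interacting binding sites and uses synthetic data; it is not the computerized procedure of the cited reference, and the function name `scatchard_fit` is hypothetical.

```python
def scatchard_fit(bound, free):
    """Fit a line to (bound, bound/free) and return (Kd, Bmax).

    Scatchard relation for one site class: B/F = (Bmax - B) / Kd,
    so slope = -1/Kd and y-intercept = Bmax/Kd.
    """
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope
    bmax = intercept * kd
    return kd, bmax


# Synthetic, noise-free data generated from assumed values Kd = 2, Bmax = 10
# (arbitrary concentration units).
true_kd, true_bmax = 2.0, 10.0
free = [0.5, 1.0, 2.0, 4.0, 8.0]
bound = [true_bmax * f / (true_kd + f) for f in free]

kd, bmax = scatchard_fit(bound, free)
print(f"Kd = {kd:.3f}, Bmax = {bmax:.3f}")
```

A smaller fitted Kd corresponds to a higher binding affinity. With real, noisy data a nonlinear fit is generally more reliable than the linear transform shown here.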

After the desired hybridoma cells are identified, the clones can be subcloned by 
limiting dilution procedures and grown by standard methods (Goding, 1986). Suitable
culture media for this purpose include, for example, Dulbecco's Modified Eagle's Medium
and RPMI-1640 medium. Alternatively, the hybridoma cells can be grown in vivo as
ascites in a mammal. 

The monoclonal antibodies secreted by the subclones can be isolated or purified 
from the culture medium or ascites fluid by conventional immunoglobulin purification
procedures such as, for example, protein A-Sepharose, hydroxylapatite chromatography,
gel electrophoresis, dialysis, or affinity chromatography. 

The monoclonal antibodies can also be made by recombinant DNA methods, such 
as those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies 
of the invention can be readily isolated and sequenced using conventional procedures (e.g.,
by using oligonucleotide probes that are capable of binding specifically to genes encoding
the heavy and light chains of murine antibodies). The hybridoma cells of the invention
serve as a preferred source of such DNA. Once isolated, the DNA can be placed into
expression vectors, which are then transfected into host cells such as simian COS cells,
Chinese hamster ovary (CHO) cells, or myeloma cells that do not otherwise produce
immunoglobulin protein, to obtain the synthesis of monoclonal antibodies in the
recombinant host cells. The DNA also can be modified, for example, by substituting the
coding sequence for human heavy and light chain constant domains in place of the
homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature 368, 812-13
(1994)) or by covalently joining to the immunoglobulin coding sequence all or part of the
coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin 
polypeptide can be substituted for the constant domains of an antibody of the invention, or 
can be substituted for the variable domains of one antigen-combining site of an antibody of 
the invention to create a chimeric bivalent antibody.

Humanized Antibodies 

The antibodies directed against the protein antigens of the invention can further 
comprise humanized antibodies or human antibodies. These antibodies are suitable for 
administration to humans without engendering an immune response by the human against 
the administered immunoglobulin. Humanized forms of antibodies are chimeric
immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab',
F(ab')2 or other antigen-binding subsequences of antibodies) that are principally comprised
of the sequence of a human immunoglobulin, and contain minimal sequence derived from a
non-human immunoglobulin. Humanization can be performed following the method of
Winter and co-workers (Jones et al., Nature, 321:522-525 (1986); Riechmann et al., Nature,
332:323-327 (1988); Verhoeyen et al., Science, 239:1534-1536 (1988)), by substituting 
rodent CDRs or CDR sequences for the corresponding sequences of a human antibody. 
(See also U.S. Patent No. 5,225,539.) In some instances, Fv framework residues of the 
human immunoglobulin are replaced by corresponding non-human residues. Humanized 
antibodies can also comprise residues which are found neither in the recipient antibody nor
in the imported CDR or framework sequences. In general, the humanized antibody will 
comprise substantially all of at least one, and typically two, variable domains, in which all 
or substantially all of the CDR regions correspond to those of a non-human 
immunoglobulin and all or substantially all of the framework regions are those of a human 
immunoglobulin consensus sequence. The humanized antibody optimally also will
comprise at least a portion of an immunoglobulin constant region (Fc), typically that of a
human immunoglobulin (Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op.
Struct. Biol., 2:593-596 (1992)).

Human Antibodies

Fully human antibodies essentially relate to antibody molecules in which the entire
sequence of both the light chain and the heavy chain, including the CDRs, arises from
human genes. Such antibodies are termed "human antibodies", or "fully human antibodies"
herein. Human monoclonal antibodies can be prepared by the trioma technique; the human
B-cell hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV 
hybridoma technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: 
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human 
monoclonal antibodies may be utilized in the practice of the present invention and may be
produced by using human hybridomas (see Cote, et al., 1983 Proc Natl Acad Sci USA 80:
2026-2030) or by transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et
al., 1985 In: Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 
77-96). 

In addition, human antibodies can also be produced using additional techniques,
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991);
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by 
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the 
endogenous immunoglobulin genes have been partially or completely inactivated. Upon
challenge, human antibody production is observed, which closely resembles that seen in
humans in all respects, including gene rearrangement, assembly, and antibody repertoire.
This approach is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806;
5,569,825; 5,625,126; 5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10,
779-783 (1992)); Lonberg et al. (Nature 368, 856-859 (1994)); Morrison (Nature 368,
812-13 (1994)); Fishwild et al. (Nature Biotechnology 14, 845-51 (1996)); Neuberger
(Nature Biotechnology 14, 826 (1996)); and Lonberg and Huszar (Intern. Rev. Immunol.
13, 65-93 (1995)).

Human antibodies may additionally be produced using transgenic nonhuman 
animals which are modified so as to produce fully human antibodies rather than the
animal's endogenous antibodies in response to challenge by an antigen. (See PCT
publication WO94/02602). The endogenous genes encoding the heavy and light 
immunoglobulin chains in the nonhuman host have been incapacitated, and active loci 
encoding human heavy and light chain immunoglobulins are inserted into the host's 
genome. The human genes are incorporated, for example, using yeast artificial
chromosomes containing the requisite human DNA segments. An animal which provides
all the desired modifications is then obtained as progeny by crossbreeding intermediate 
transgenic animals containing fewer than the full complement of the modifications. The 
preferred embodiment of such a nonhuman animal is a mouse, and is termed the
Xenomouse™ as disclosed in PCT publications WO 96/33735 and WO 96/34096. This
animal produces B cells which secrete fully human immunoglobulins. The antibodies can
be obtained directly from the animal after immunization with an immunogen of interest, as,
for example, a preparation of a polyclonal antibody, or alternatively from immortalized B 

cells derived from the animal, such as hybridomas producing monoclonal antibodies.

Additionally, the genes encoding the immunoglobulins with human variable regions can be 
recovered and expressed to obtain the antibodies directly, or can be further modified to 
obtain analogs of antibodies such as, for example, single chain Fv molecules. 

An example of a method of producing a nonhuman host, exemplified as a mouse, 

lacking expression of an endogenous immunoglobulin heavy chain is disclosed in U.S.
Patent No. 5,939,598. It can be obtained by a method including deleting the J segment 
genes from at least one endogenous heavy chain locus in an embryonic stem cell to prevent 
rearrangement of the locus and to prevent formation of a transcript of a rearranged 
immunoglobulin heavy chain locus, the deletion being effected by a targeting vector 

containing a gene encoding a selectable marker; and producing from the embryonic stem
cell a transgenic mouse whose somatic and germ cells contain the gene encoding the 
selectable marker. 

A method for producing an antibody of interest, such as a human antibody, is 
disclosed in U.S. Patent No. 5,916,771. It includes introducing an expression vector that
contains a nucleotide sequence encoding a heavy chain into one mammalian host cell in
culture, introducing an expression vector containing a nucleotide sequence encoding a light 
chain into another mammalian host cell, and fusing the two cells to form a hybrid cell. The 
hybrid cell expresses an antibody containing the heavy chain and the light chain. 

In a further improvement on this procedure, a method for identifying a clinically 

relevant epitope on an immunogen, and a correlative method for selecting an antibody that
binds immunospecifically to the relevant epitope with high affinity, are disclosed in PCT 
publication WO 99/53049. 

Fab Fragments and Single Chain Antibodies 

According to the invention, techniques can be adapted for the production of 
single-chain antibodies specific to an antigenic protein of the invention (see e.g., U.S.
Patent No. 4,946,778). In addition, methods can be adapted for the construction of Fab
expression libraries (see e.g., Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and
effective identification of monoclonal Fab fragments with the desired specificity for a
protein or derivatives, fragments, analogs or homologs thereof. Antibody fragments that
contain the idiotypes to a protein antigen may be produced by techniques known in the art
including, but not limited to: (i) an F(ab')2 fragment produced by pepsin digestion of an
antibody molecule; (ii) an Fab fragment generated by reducing the disulfide bridges of an
F(ab')2 fragment; (iii) an Fab fragment generated by the treatment of the antibody molecule
with papain and a reducing agent and (iv) Fv fragments.

Bispecific Antibodies 

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies 
that have binding specificities for at least two different antigens. In the present case, one of 
the binding specificities is for an antigenic protein of the invention. The second binding
target is any other antigen, and advantageously is a cell-surface protein or receptor or 
receptor subunit. 

Methods for making bispecific antibodies are known in the art. Traditionally, the 
recombinant production of bispecific antibodies is based on the co-expression of two
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random 
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) 
produce a potential mixture of ten different antibody molecules, of which only one has the 
correct bispecific structure. The purification of the correct molecule is usually
accomplished by affinity chromatography steps. Similar procedures are disclosed in WO
93/08829, published 13 May 1993, and in Traunecker et al., EMBO J., 10:3655-3659 
(1991). 
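
The figure of ten possible molecules follows from simple counting, as the following sketch shows. It assumes, for illustration only, that each antibody is an unordered pair of heavy-light arms and that every heavy chain can pair with either light chain; the chain labels `H1`, `H2`, `L1` and `L2` are hypothetical.

```python
from itertools import combinations_with_replacement, product

heavy = ["H1", "H2"]  # heavy chains of antibody 1 and antibody 2
light = ["L1", "L2"]  # light chains of antibody 1 and antibody 2

# Each arm is one heavy chain paired with one light chain: 2 x 2 = 4 arms.
arms = [h + l for h, l in product(heavy, light)]

# An antibody molecule is an unordered pair of arms (the two arms may match).
molecules = list(combinations_with_replacement(arms, 2))

# Only the molecule carrying both cognate arms is the desired bispecific.
bispecific = [m for m in molecules if set(m) == {"H1L1", "H2L2"}]

print(len(molecules), len(bispecific))
```

Under random assortment the desired product is therefore one species among ten, which is why the affinity purification steps described above are needed.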

Antibody variable domains with the desired binding specificities (antibody-antigen 
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least
part of the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain
constant region (CH1) containing the site necessary for light-chain binding present in at
least one of the fusions. DNAs encoding the immunoglobulin heavy-chain fusions and, if
desired, the immunoglobulin light chain, are inserted into separate expression vectors, and
are co-transfected into a suitable host organism. For further details of generating bispecific
antibodies see, for example, Suresh et al., Methods in Enzymology, 121:210 (1986). 

According to another approach described in WO 96/27011, the interface between a
pair of antibody molecules can be engineered to maximize the percentage of heterodimers
which are recovered from recombinant cell culture. The preferred interface comprises at
least a part of the CH3 region of an antibody constant domain. In this method, one or more
small amino acid side chains from the interface of the first antibody molecule are replaced
with larger side chains (e.g. tyrosine or tryptophan). Compensatory "cavities" of identical
or similar size to the large side chain(s) are created on the interface of the second antibody
molecule by replacing large amino acid side chains with smaller ones (e.g. alanine or 
threonine). This provides a mechanism for increasing the yield of the heterodimer over 
other unwanted end-products such as homodimers. 

Bispecific antibodies can be prepared as full length antibodies or antibody
fragments (e.g. F(ab')2 bispecific antibodies). Techniques for generating bispecific
antibodies from antibody fragments have been described in the literature. For example,
bispecific antibodies can be prepared using chemical linkage. Brennan et al., Science
229:81 (1985) describe a procedure wherein intact antibodies are proteolytically cleaved to
generate F(ab')2 fragments. These fragments are reduced in the presence of the dithiol
complexing agent sodium arsenite to stabilize vicinal dithiols and prevent intermolecular
disulfide formation. The Fab' fragments generated are then converted to thionitrobenzoate 
(TNB) derivatives. One of the Fab'-TNB derivatives is then reconverted to the Fab'-thiol 
by reduction with mercaptoethylamine and is mixed with an equimolar amount of the other 
Fab'-TNB derivative to form the bispecific antibody. The bispecific antibodies produced
can be used as agents for the selective immobilization of enzymes.

Additionally, Fab' fragments can be directly recovered from E. coli and chemically 
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225 (1992)
describe the production of a fully humanized bispecific antibody F(ab')2 molecule. Each
Fab' fragment was separately secreted from E. coli and subjected to directed chemical
coupling in vitro to form the bispecific antibody. The bispecific antibody thus formed was
able to bind to cells overexpressing the ErbB2 receptor and normal human T cells, as well 
as trigger the lytic activity of human cytotoxic lymphocytes against human breast tumor 
targets. 

Various techniques for making and isolating bispecific antibody fragments directly 
from recombinant cell culture have also been described. For example, bispecific antibodies
have been produced using leucine zippers. Kostelny et al., J. Immunol. 148(5):1547-1553
(1992). The leucine zipper peptides from the Fos and Jun proteins were linked to the Fab'
portions of two different antibodies by gene fusion. The antibody homodimers were
reduced at the hinge region to form monomers and then re-oxidized to form the antibody
heterodimers. This method can also be utilized for the production of antibody homodimers.
The "diabody" technology described by Hollinger et al., Proc. Natl. Acad. Sci. USA
90:6444-6448 (1993) has provided an alternative mechanism for making bispecific
antibody fragments. The fragments comprise a heavy-chain variable domain (VH)
connected to a light-chain variable domain (VL) by a linker which is too short to allow
pairing between the two domains on the same chain. Accordingly, the VH and VL domains
of one fragment are forced to pair with the complementary VL and VH domains of another
fragment, thereby forming two antigen-binding sites. Another strategy for making
bispecific antibody fragments by the use of single-chain Fv (sFv) dimers has also been
reported. See, Gruber et al., J. Immunol. 152:5368 (1994).

Antibodies with more than two valencies are contemplated. For example, 
trispecific antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991).

Exemplary bispecific antibodies can bind to two different epitopes, at least one of 
which originates in the protein antigen of the invention. Alternatively, an anti-antigenic 
arm of an immunoglobulin molecule can be combined with an arm which binds to a
triggering molecule on a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3,
CD28, or B7), or Fc receptors for IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and
FcγRIII (CD16) so as to focus cellular defense mechanisms to the cell expressing the
particular antigen. Bispecific antibodies can also be used to direct cytotoxic agents to cells
which express a particular antigen. These antibodies possess an antigen-binding arm and
an arm which binds a cytotoxic agent or a radionuclide chelator, such as EOTUBE, DTPA,
DOTA, or TETA. Another bispecific antibody of interest binds the protein antigen 
described herein and further binds tissue factor (TF). 

Heteroconjugate Antibodies 

Heteroconjugate antibodies are also within the scope of the present invention. 
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such 
antibodies have, for example, been proposed to target immune system cells to unwanted 
cells (U.S. Patent No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 
92/200373; EP 03089). It is contemplated that the antibodies can be prepared in vitro using 
known methods in synthetic protein chemistry, including those involving crosslinking 
agents. For example, immunotoxins can be constructed using a disulfide exchange reaction 
or by forming a thioether bond. Examples of suitable reagents for this purpose include
iminothiolate and methyl-4-mercaptobutyrimidate and those disclosed, for example, in U.S.
Patent No. 4,676,980. 

Effector Function Engineering 

It can be desirable to modify the antibody of the invention with respect to effector 
function, so as to enhance, e.g., the effectiveness of the antibody in treating cancer. For
example, cysteine residue(s) can be introduced into the Fc region, thereby allowing
interchain disulfide bond formation in this region. The homodimeric antibody thus
generated can have improved internalization capability and/or increased
complement-mediated cell killing and antibody-dependent cellular cytotoxicity (ADCC).
See Caron et al., J. Exp. Med., 176: 1191-1195 (1992) and Shopes, J. Immunol., 148:
2918-2922 (1992). Homodimeric antibodies with enhanced anti-tumor activity can also be
prepared using heterobifunctional cross-linkers as described in Wolff et al. Cancer
Research, 53: 2560-2565 (1993). Alternatively, an antibody can be engineered that has
dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities.
See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989).

Immunoconjugates 

The invention also pertains to immunoconjugates comprising an antibody 
conjugated to a cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an 
enzymatically active toxin of bacterial, fungal, plant, or animal origin, or fragments
thereof), or a radioactive isotope (i.e., a radioconjugate).

Chemotherapeutic agents useful in the generation of such immunoconjugates have 
been described above. Enzymatically active toxins and fragments thereof that can be used 
include diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A 
chain (from Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain,
alpha-sarcin, Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins
(PAPI, PAPII, and PAP-S), Momordica charantia inhibitor, curcin, crotin, Saponaria
officinalis inhibitor, gelonin, mitogellin, restrictocin, phenomycin, enomycin, and the
trichothecenes. A variety of radionuclides are available for the production of
radioconjugated antibodies. Examples include 212Bi, 131I, 131In, 90Y, and 186Re.

Conjugates of the antibody and cytotoxic agent are made using a variety of
bifunctional protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol)
propionate (SPDP), iminothiolane (IT), bifunctional derivatives of imidoesters (such as
dimethyl adipimidate HCl), active esters (such as disuccinimidyl suberate), aldehydes
(such as glutaraldehyde), bis-azido compounds (such as bis (p-azidobenzoyl)
hexanediamine), bis-diazonium derivatives (such as
bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as tolylene
2,6-diisocyanate), and bis-active fluorine compounds (such as
1,5-difluoro-2,4-dinitrobenzene). For example, a ricin immunotoxin can be prepared as
described in Vitetta et al., Science, 238: 1098 (1987). Carbon-14-labeled
1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid (MX-DTPA) is an
exemplary chelating agent for conjugation of radionuclide to the antibody. See
WO94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate
is administered to the patient, followed by removal of unbound conjugate from the
circulation using a clearing agent and then administration of a "ligand" (e.g., avidin) that is
in turn conjugated to a cytotoxic agent.

Immunoliposomes 

The antibodies disclosed herein can also be formulated as immunoliposomes. 
Liposomes containing the antibody are prepared by methods known in the art, such as 
described in Epstein et al., Proc. Natl. Acad. Sci. USA, 82: 3688 (1985); Hwang et al.,
Proc. Natl. Acad. Sci. USA, 77: 4030 (1980); and U.S. Pat. Nos. 4,485,045 and 4,544,545.
Liposomes with enhanced circulation time are disclosed in U.S. Patent No. 5,013,556. 

Particularly useful liposomes can be generated by the reverse-phase evaporation 
method with a lipid composition comprising phosphatidylcholine, cholesterol, and 
PEG-derivatized phosphatidylethanolamine (PEG-PE). Liposomes are extruded through
filters of defined pore size to yield liposomes with the desired diameter. Fab' fragments of
the antibody of the present invention can be conjugated to the liposomes as described in
Martin et al., J. Biol. Chem., 257: 286-288 (1982) via a disulfide-interchange reaction. A
chemotherapeutic agent (such as Doxorubicin) is optionally contained within the liposome.
See Gabizon et al., J. National Cancer Inst., 81(19): 1484 (1989).

Diagnostic Applications of Antibodies Directed Against the Proteins of the Invention

In one embodiment, methods for the screening of antibodies that possess the desired 
specificity include, but are not limited to, enzyme linked immunosorbent assay (ELISA)
and other immunologically mediated techniques known within the art. In a specific
embodiment, selection of antibodies that are specific to a particular domain of an NOVX
protein is facilitated by generation of hybridomas that bind to the fragment of an NOVX
protein possessing such a domain. Thus, antibodies that are specific for a desired domain
within an NOVX protein, or derivatives, fragments, analogs or homologs thereof, are also
provided herein. 

Antibodies directed against a NOVX protein of the invention may be used in 
methods known within the art relating to the localization and/or quantitation of a NOVX 
protein (e.g., for use in measuring levels of the NOVX protein within appropriate 
physiological samples, for use in diagnostic methods, for use in imaging the protein, and
the like). In a given embodiment, antibodies specific to a NOVX protein, or derivative, 
fragment, analog or homolog thereof, that contain the antibody derived antigen binding 
domain, are utilized as pharmacologically active compounds (referred to hereinafter as 
"Therapeutics"). 

An antibody specific for a NOVX protein of the invention (e.g., a monoclonal
antibody or a polyclonal antibody) can be used to isolate a NOVX polypeptide by standard
techniques, such as immunoaffinity chromatography or immunoprecipitation. An antibody
to a NOVX polypeptide can facilitate the purification of a natural NOVX antigen from
cells, or of a recombinantly produced NOVX antigen expressed in host cells. Moreover,
such an anti-NOVX antibody can be used to detect the antigenic NOVX protein (e.g., in a
cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of
expression of the antigenic NOVX protein. Antibodies directed against a NOVX protein
can be used diagnostically to monitor protein levels in tissue as part of a clinical testing
procedure, e.g., to determine the efficacy of a given treatment regimen.

Detection can be facilitated by coupling (i.e., physically linking) the antibody to a
detectable substance. Examples of detectable substances include various enzymes,
prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials,
and radioactive materials. Examples of suitable enzymes include horseradish peroxidase,
alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable
prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of
suitable fluorescent materials include umbelliferone, fluorescein, fluorescein
isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or
phycoerythrin; an example of a luminescent material includes luminol; examples of
bioluminescent materials include luciferase, luciferin, and aequorin, and examples of
suitable radioactive material include 125I, 131I, 35S or 3H.

Antibody Therapeutics 

Antibodies of the invention, including polyclonal, monoclonal, humanized and fully
human antibodies, may be used as therapeutic agents. Such agents will generally be employed
to treat or prevent a disease or pathology in a subject. An antibody preparation, preferably
one having high specificity and high affinity for its target antigen, is administered to the 
subject and will generally have an effect due to its binding with the target. Such an effect 
may be one of two kinds, depending on the specific nature of the interaction between the
given antibody molecule and the target antigen in question. In the first instance,
administration of the antibody may abrogate or inhibit the binding of the target with an
endogenous ligand to which it naturally binds. In this case, the antibody binds to the target
and masks a binding site of the naturally occurring ligand, wherein the ligand serves as an
effector molecule. Thus the receptor mediates a signal transduction pathway for which
ligand is responsible.

Alternatively, the effect may be one in which the antibody elicits a physiological 
result by virtue of binding to an effector binding site on the target molecule. In this case 
the target, a receptor having an endogenous ligand which may be absent or defective in the 
disease or pathology, binds the antibody as a surrogate effector ligand, initiating a
receptor-based signal transduction event by the receptor.

A therapeutically effective amount of an antibody of the invention relates generally 
to the amount needed to achieve a therapeutic objective. As noted above, this may be a 
binding interaction between the antibody and its target antigen that, in certain cases, 
interferes with the functioning of the target, and in other cases, promotes a physiological
response. The amount required to be administered will furthermore depend on the binding
affinity of the antibody for its specific antigen, and will also depend on the rate at which an
administered antibody is depleted from the free volume of the subject to which it is
administered. Common ranges for therapeutically effective dosing of an antibody or
antibody fragment of the invention may be, by way of nonlimiting example, from about 0.1
mg/kg body weight to about 50 mg/kg body weight. Common dosing frequencies may
range, for example, from twice daily to once a week. 
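
By way of arithmetic illustration only, the nonlimiting range above can be converted to an absolute amount for a given body weight. The function name `dose_range_mg` and the 70 kg example weight are assumptions made for this sketch, which is not dosing guidance.

```python
def dose_range_mg(weight_kg, low_mg_per_kg=0.1, high_mg_per_kg=50.0):
    """Return (minimum, maximum) absolute dose in mg for a body weight in kg.

    The default bounds are the nonlimiting example range quoted in the text.
    """
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return weight_kg * low_mg_per_kg, weight_kg * high_mg_per_kg


# Hypothetical 70 kg subject.
low_mg, high_mg = dose_range_mg(70.0)
print(f"{low_mg:.1f} mg to {high_mg:.1f} mg per administration")
```

For the assumed 70 kg subject this corresponds to roughly 7 mg to 3500 mg per administration. The amount actually required would depend on the binding affinity and clearance rate discussed above.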

Pharmaceutical Compositions of Antibodies

Antibodies specifically binding a protein of the invention, as well as other 
molecules identified by the screening assays disclosed herein, can be administered for the 
treatment of various disorders in the form of pharmaceutical compositions. Principles and
considerations involved in preparing such compositions, as well as guidance in the choice
of components are provided, for example, in Remington: The Science And Practice Of
Pharmacy 19th ed. (Alfonso R. Gennaro, et al., editors) Mack Pub. Co., Easton, Pa.: 1995;
Drug Absorption Enhancement: Concepts, Possibilities, Limitations, And Trends,
Harwood Academic Publishers, Langhorne, Pa., 1994; and Peptide And Protein Drug
Delivery (Advances In Parenteral Sciences, Vol. 4), 1991, M. Dekker, New York.

If the antigenic protein is intracellular and whole antibodies are used as inhibitors, 
internalizing antibodies are preferred. However, liposomes can also be used to deliver the 
antibody, or an antibody fragment, into cells. Where antibody fragments are used, the 
smallest inhibitory fragment that specifically binds to the binding domain of the target 

15 protein is preferred. For example, based upon the variable-region sequences of an 

antibody, peptide molecules can be designed that retain the ability to bind the target protein 
sequence. Such peptides can be synthesized chemically and/or produced by recombinant 
DNA technology. See, e.g., Marasco et al., Proc. Natl. Acad. Sci. USA, 90: 7889-7893 
(1993). The formulation herein can also contain more than one active compound as 

20 necessary for the particular indication being treated, preferably those with complementary 
activities that do not adversely affect each other. Alternatively, or in addition, the 
composition can comprise an agent that enhances its function, such as, for example, a 
cytotoxic agent, cytokine, chemotherapeutic agent, or growth-inhibitory agent. Such 
molecules are suitably present in combination in amounts that are effective for the purpose 

25 intended. 

The active ingredients can also be entrapped in microcapsules prepared, for 
example, by coacervation techniques or by interfacial polymerization, for example, 
hydroxymethylcellulose or gelatin-microcapsules and poly-(methylmethacrylate) 
microcapsules, respectively, in colloidal drug delivery systems (for example, liposomes, 
albumin microspheres, microemulsions, nano-particles, and nanocapsules) or in 
macroemulsions. 

The formulations to be used for in vivo administration must be sterile. This is 
readily accomplished by filtration through sterile filtration membranes. 

Sustained-release preparations can be prepared. Suitable examples of 
sustained-release preparations include semipermeable matrices of solid hydrophobic 
polymers containing the antibody, which matrices are in the form of shaped articles, e.g., 
films, or microcapsules. Examples of sustained-release matrices include polyesters, 

5 hydrogels (for example, poly(2-hydroxyethyl-methacrylate), or poly(vinylalcohol)), 
polylactides (U.S. Pat. No. 3,773,919), copolymers of L-glutamic acid and γ 
ethyl-L-glutamate, non-degradable ethylene-vinyl acetate, degradable lactic acid-glycolic 
acid copolymers such as the LUPRON DEPOT ™ (injectable microspheres composed of 
lactic acid-glycolic acid copolymer and leuprolide acetate), and 

10 poly-D-(-)-3-hydroxybutyric acid. While polymers such as ethylene-vinyl acetate and 
lactic acid-glycolic acid enable release of molecules for over 100 days, certain hydrogels 
release proteins for shorter time periods. 

ELISA Assay 

An agent for detecting an analyte protein is an antibody capable of binding to an 

15 analyte protein, preferably an antibody with a detectable label. Antibodies can be 

polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof 
(e.g., Fab or F(ab')2) can be used. The term "labeled", with regard to the probe or antibody, is 
intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically 
linking) a detectable substance to the probe or antibody, as well as indirect labeling of the 

20 probe or antibody by reactivity with another reagent that is directly labeled. Examples of 
indirect labeling include detection of a primary antibody using a fluorescently-labeled 
secondary antibody and end-labeling of a DNA probe with biotin such that it can be 
detected with fluorescently-labeled streptavidin. The term "biological sample" is intended 
to include tissues, cells and biological fluids isolated from a subject, as well as tissues, cells 

25 and fluids present within a subject. Included within the usage of the term "biological 

sample", therefore, is blood and a fraction or component of blood including blood serum, 
blood plasma, or lymph. That is, the detection method of the invention can be used to 
detect an analyte mRNA, protein, or genomic DNA in a biological sample in vitro as well 
as in vivo. For example, in vitro techniques for detection of an analyte mRNA include 

30 Northern hybridizations and in situ hybridizations. In vitro techniques for detection of an 
analyte protein include enzyme linked immunosorbent assays (ELISAs), Western blots, 
immunoprecipitations, and immunofluorescence. In vitro techniques for detection of an 
analyte genomic DNA include Southern hybridizations. Procedures for conducting 
immunoassays are described, for example in "ELISA: Theory and Practice: Methods in 
Molecular Biology", Vol. 42, J. R. Crowther (Ed.) Humana Press, Totowa, NJ, 1995; 
"Immunoassay", E. Diamandis and T. Christopoulos, Academic Press, Inc., San Diego, 
CA, 1996; and "Practice and Theory of Enzyme Immunoassays", P. Tijssen, Elsevier 
Science Publishers, Amsterdam, 1985. Furthermore, in vivo techniques for detection of an 
analyte protein include introducing into a subject a labeled anti-analyte protein antibody. 
For example, the antibody can be labeled with a radioactive marker whose presence and 
location in a subject can be detected by standard imaging techniques. 

NOVX Recombinant Expression Vectors and Host Cells 

10 Another aspect of the invention pertains to vectors, preferably expression vectors, 

containing a nucleic acid encoding a NOVX protein, or derivatives, fragments, analogs or 
homologs thereof. As used herein, the term "vector" refers to a nucleic acid molecule 
capable of transporting another nucleic acid to which it has been linked. One type of vector 
is a "plasmid", which refers to a circular double stranded DNA loop into which additional 

15 DNA segments can be ligated. Another type of vector is a viral vector, wherein additional 
DNA segments can be ligated into the viral genome. Certain vectors are capable of 
autonomous replication in a host cell into which they are introduced (e.g., bacterial vectors 
having a bacterial origin of replication and episomal mammalian vectors). Other vectors 
(e.g., non-episomal mammalian vectors) are integrated into the genome of a host cell upon 

20 introduction into the host cell, and thereby are replicated along with the host genome. 

Moreover, certain vectors are capable of directing the expression of genes to which they are 
operatively-linked. Such vectors are referred to herein as "expression vectors". In general, 
expression vectors of utility in recombinant DNA techniques are often in the form of 
plasmids. In the present specification, "plasmid" and "vector" can be used interchangeably 

25 as the plasmid is the most commonly used form of vector. However, the invention is 
intended to include such other forms of expression vectors, such as viral vectors (e.g., 
replication defective retroviruses, adenoviruses and adeno-associated viruses), which serve 
equivalent functions. 

The recombinant expression vectors of the invention comprise a nucleic acid of the 
30 invention in a form suitable for expression of the nucleic acid in a host cell, which means 
that the recombinant expression vectors include one or more regulatory sequences, selected 
on the basis of the host cells to be used for expression, that is operatively-linked to the 
nucleic acid sequence to be expressed. Within a recombinant expression vector, 
"operably-linked" is intended to mean that the nucleotide sequence of interest is linked to 
the regulatory sequence(s) in a manner that allows for expression of the nucleotide 
sequence (e.g., in an in vitro transcription/translation system or in a host cell when the 
vector is introduced into the host cell). 

The term "regulatory sequence" is intended to include promoters, enhancers and 

other expression control elements (e.g., polyadenylation signals). Such regulatory 
sequences are described, for example, in Goeddel, Gene Expression Technology: 
Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990). Regulatory 
sequences include those that direct constitutive expression of a nucleotide sequence in 

10 many types of host cell and those that direct expression of the nucleotide sequence only in 
certain host cells (e.g., tissue-specific regulatory sequences). It will be appreciated by 
those skilled in the art that the design of the expression vector can depend on such factors 
as the choice of the host cell to be transformed, the level of expression of protein desired, 
etc. The expression vectors of the invention can be introduced into host cells to thereby 

15 produce proteins or peptides, including fusion proteins or peptides, encoded by nucleic 
acids as described herein (e.g., NOVX proteins, mutant forms of NOVX proteins, fusion 
proteins, etc.). 

The recombinant expression vectors of the invention can be designed for expression 
of NOVX proteins in prokaryotic or eukaryotic cells. For example, NOVX proteins can be 

expressed in bacterial cells such as Escherichia coli, insect cells (using baculovirus 
expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further 
in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic 
Press, San Diego, Calif. (1990). Alternatively, the recombinant expression vector can be 
transcribed and translated in vitro, for example using T7 promoter regulatory sequences 

25 and T7 polymerase. 

Expression of proteins in prokaryotes is most often carried out in Escherichia coli 
with vectors containing constitutive or inducible promoters directing the expression of 
either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a 
protein encoded therein, usually to the amino terminus of the recombinant protein. Such 

fusion vectors typically serve three purposes: (i) to increase expression of recombinant 
protein; (ii) to increase the solubility of the recombinant protein; and (iii) to aid in the 
purification of the recombinant protein by acting as a ligand in affinity purification. Often, 
in fusion expression vectors, a proteolytic cleavage site is introduced at the junction of the 

WO 03/029424 



PCT/US02/31373 



fusion moiety and the recombinant protein to enable separation of the recombinant protein 
from the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and 
their cognate recognition sequences, include Factor Xa, thrombin and enterokinase. 
Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith and 
Johnson, 1988. Gene 67: 31-40), pMAL (New England Biolabs, Beverly, Mass.) and 
pRIT5 (Pharmacia, Piscataway, N.J.) that fuse glutathione S-transferase (GST), maltose E 
binding protein, or protein A, respectively, to the target recombinant protein. 
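
The separation step described above can be sketched in Python as a string operation on one-letter amino acid sequences. This is a minimal sketch under stated assumptions: the recognition motifs are the commonly cited consensus sequences for the three proteases named above, cleavage is taken to occur immediately after the motif, and the fusion sequence and function name are hypothetical. Real proteases tolerate some sequence variation and may cleave at secondary sites, which this sketch ignores.

```python
# Commonly cited consensus recognition motifs (one-letter code).
# Assumption: cleavage occurs immediately after the last residue shown.
CLEAVAGE_SITES = {
    "factor_xa": "IEGR",
    "enterokinase": "DDDDK",
    "thrombin": "LVPR",  # thrombin typically leaves Gly-Ser on the product
}


def split_fusion(fusion_seq, protease):
    """Split a fusion protein at the first recognition motif.

    Returns (fusion_moiety_with_site, recombinant_protein).
    """
    site = CLEAVAGE_SITES[protease]
    idx = fusion_seq.find(site)
    if idx < 0:
        raise ValueError(f"no {protease} site found")
    cut = idx + len(site)
    return fusion_seq[:cut], fusion_seq[cut:]


# Hypothetical fusion: tag + Factor Xa site + target protein.
tag, target = split_fusion("MSPILGYWKIEGRMKTAYIAK", "factor_xa")
print(tag, target)
```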

Examples of suitable inducible non-fusion E. coli expression vectors include pTrc 
(Amrann et al., (1988) Gene 69:301-315) and pET 11d (Studier et al., Gene Expression 
Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990) 
60-89). 

One strategy to maximize recombinant protein expression in E. coli is to express the 
protein in a host bacteria with an impaired capacity to proteolytically cleave the 
recombinant protein. See, e.g., Gottesman, Gene Expression Technology: Methods in 

Enzymology 185, Academic Press, San Diego, Calif. (1990) 119-128. Another strategy is 
to alter the nucleic acid sequence of the nucleic acid to be inserted into an expression vector 
so that the individual codons for each amino acid are those preferentially utilized in E. coli 
(see, e.g., Wada, et al., 1992. Nucl. Acids Res. 20: 2111-2118). Such alteration of nucleic 
acid sequences of the invention can be carried out by standard DNA synthesis techniques. 
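
The codon-replacement strategy described above can be illustrated with a short Python sketch that swaps each codon for a preferred synonymous codon while leaving the encoded protein unchanged. The `PREFERRED` table below is a hypothetical placeholder covering only two amino acids; an actual table would be derived from measured codon usage data. The function names and the example sequence are likewise assumptions made for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)
}

# Hypothetical preference table (illustration only, not usage data).
PREFERRED = {"L": "CTG", "R": "CGT"}


def codons(dna):
    """Yield complete codons; a trailing partial codon is dropped."""
    for i in range(0, len(dna) - len(dna) % 3, 3):
        yield dna[i:i + 3]


def translate(dna):
    """Translate an uppercase DNA coding sequence ('*' marks a stop)."""
    return "".join(CODON_TABLE[c] for c in codons(dna))


def recode(dna, preferred=PREFERRED):
    """Replace codons with the preferred synonym where one is listed."""
    return "".join(preferred.get(CODON_TABLE[c], c) for c in codons(dna))


original = "ATGCTAAGATAA"  # hypothetical: Met-Leu-Arg-stop
optimized = recode(original)
print(optimized)
```

Because only synonymous codons are substituted, `translate(optimized)` equals `translate(original)`.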

20 In another embodiment, the NOVX expression vector is a yeast expression vector. 

Examples of vectors for expression in yeast Saccharomyces cerevisiae include pYepSec1 
(Baldari, et al., 1987. EMBO J. 6: 229-234), pMFa (Kurjan and Herskowitz, 1982. Cell 30: 
933-943), pJRY88 (Schultz et al., 1987. Gene 54: 113-123), pYES2 (Invitrogen 
Corporation, San Diego, Calif.), and picZ (InVitrogen Corp, San Diego, Calif.). 

25 Alternatively, NOVX can be expressed in insect cells using baculovirus expression 

vectors. Baculovirus vectors available for expression of proteins in cultured insect cells 
(e.g., SF9 cells) include the pAc series (Smith, et al., 1983. Mol. Cell. Biol. 3: 2156-2165) 
and the pVL series (Lucklow and Summers, 1989. Virology 170: 31-39). 

In yet another embodiment, a nucleic acid of the invention is expressed in 

30 mammalian cells using a mammalian expression vector. Examples of mammalian 

expression vectors include pCDM8 (Seed, 1987. Nature 329: 840) and pMT2PC (Kaufman, 
et al., 1987. EMBO J. 6: 187-195). When used in mammalian cells, the expression vector's 
control functions are often provided by viral regulatory elements. For example, commonly 
used promoters are derived from polyoma, adenovirus 2, cytomegalovirus, and simian virus 
40. For other suitable expression systems for both prokaryotic and eukaryotic cells see, 
e.g., Chapters 16 and 17 of Sambrook, et al., Molecular Cloning: A Laboratory 
Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, 
Cold Spring Harbor, N.Y., 1989. 

In another embodiment, the recombinant mammalian expression vector is capable 
of directing expression of the nucleic acid preferentially in a particular cell type (e.g., 
tissue-specific regulatory elements are used to express the nucleic acid). Tissue-specific 
regulatory elements are known in the art. Non-limiting examples of suitable tissue-specific 

promoters include the albumin promoter (liver-specific; Pinkert, et al., 1987. Genes Dev. 1: 
268-277), lymphoid-specific promoters (Calame and Eaton, 1988. Adv. Immunol. 43: 
235-275), in particular promoters of T cell receptors (Winoto and Baltimore, 1989. EMBO 
J. 8: 729-733) and immunoglobulins (Banerji, et al., 1983. Cell 33: 729-740; Queen and 
Baltimore, 1983. Cell 33: 741-748), neuron-specific promoters (e.g., the neurofilament 
promoter; Byrne and Ruddle, 1989. Proc. Natl. Acad. Sci. USA 86: 5473-5477), 
pancreas-specific promoters (Edlund, et al., 1985. Science 230: 912-916), and mammary 
gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European 
Application Publication No. 264,166). Developmentally-regulated promoters are also 
encompassed, e.g., the murine hox promoters (Kessel and Gruss, 1990. Science 249: 
374-379) and the α-fetoprotein promoter (Camper and Tilghman, 1989. Genes Dev. 3: 
537-546). 

The invention further provides a recombinant expression vector comprising a DNA 
molecule of the invention cloned into the expression vector in an antisense orientation. 
That is, the DNA molecule is operatively-linked to a regulatory sequence in a manner that 

25 allows for expression (by transcription of the DNA molecule) of an RNA molecule that is 
antisense to NOVX mRNA. Regulatory sequences operatively linked to a nucleic acid 
cloned in the antisense orientation can be chosen that direct the continuous expression of 
the antisense RNA molecule in a variety of cell types, for instance viral promoters and/or 
enhancers, or regulatory sequences can be chosen that direct constitutive, tissue specific or 

30 cell type specific expression of antisense RNA. The antisense expression vector can be in 
the form of a recombinant plasmid, phagemid or attenuated virus in which antisense nucleic 
acids are produced under the control of a high efficiency regulatory region, the activity of 
which can be determined by the cell type into which the vector is introduced. For a 
discussion of the regulation of gene expression using antisense genes see, e.g., Weintraub, 
et al., "Antisense RNA as a molecular tool for genetic analysis," Reviews-Trends in 
Genetics, Vol. 1(1) 1986. 
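
The relationship between an insert cloned in the antisense orientation and the resulting antisense RNA can be illustrated with a reverse-complement calculation. The Python sketch below is a simplified model: it treats the insert as the coding-strand sequence read downstream of the promoter, ignores vector and regulatory sequences, and uses a hypothetical nine-base example sequence. Function names are assumptions introduced here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def transcribe(coding_strand):
    """Return the RNA matching the coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


# Hypothetical fragment of a coding sequence in the sense orientation.
sense_insert = "ATGGCCAAG"

# Cloning in the antisense orientation presents the reverse complement
# as the coding strand downstream of the promoter.
antisense_insert = reverse_complement(sense_insert)

mrna = transcribe(sense_insert)
antisense_rna = transcribe(antisense_insert)
print(mrna, antisense_rna)
```

The antisense transcript is complementary to the mRNA along its full length, which is the property that allows the two molecules to base-pair.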

Another aspect of the invention pertains to host cells into which a recombinant 

5 expression vector of the invention has been introduced. The terms "host cell" and 

"recombinant host cell" are used interchangeably herein. It is understood that such terms 
refer not only to the particular subject cell but also to the progeny or potential progeny of 
such a cell. Because certain modifications may occur in succeeding generations due to 
either mutation or environmental influences, such progeny may not, in fact, be identical to 

10 the parent cell, but are still included within the scope of the term as used herein. 

A host cell can be any prokaryotic or eukaryotic cell. For example, NOVX protein 
can be expressed in bacterial cells such as E. coli, insect cells, yeast or mammalian cells 
(such as Chinese hamster ovary cells (CHO) or COS cells). Other suitable host cells are 
known to those skilled in the art. 

15 Vector DNA can be introduced into prokaryotic or eukaryotic cells via conventional 

transformation or transfection techniques. As used herein, the terms "transformation" and 
"transfection" are intended to refer to a variety of art-recognized techniques for introducing 
foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium 
chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or 

20 electroporation. Suitable methods for transforming or transfecting host cells can be found 
in Sambrook, et al. (Molecular Cloning: A Laboratory Manual. 2nd ed., Cold 
Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, 
N.Y., 1989), and other laboratory manuals. 

For stable transfection of mammalian cells, it is known that, depending upon the 

25 expression vector and transfection technique used, only a small fraction of cells may 
integrate the foreign DNA into their genome. In order to identify and select these 
integrants, a gene that encodes a selectable marker (e.g., resistance to antibiotics) is 
generally introduced into the host cells along with the gene of interest. Various selectable 
markers include those that confer resistance to drugs, such as G418, hygromycin and 

30 methotrexate. Nucleic acid encoding a selectable marker can be introduced into a host cell 
on the same vector as that encoding NOVX or can be introduced on a separate vector. 
Cells stably transfected with the introduced nucleic acid can be identified by drug selection 
(e.g., cells that have incorporated the selectable marker gene will survive, while the other 
cells die). 

A host cell of the invention, such as a prokaryotic or eukaryotic host cell in culture, 
can be used to produce (i.e., express) NOVX protein. Accordingly, the invention further 
provides methods for producing NOVX protein using the host cells of the invention. In one 
embodiment, the method comprises culturing the host cell of the invention (into which a 
recombinant expression vector encoding NOVX protein has been introduced) in a suitable 
medium such that NOVX protein is produced. In another embodiment, the method further 
comprises isolating NOVX protein from the medium or the host cell. 

Transgenic NOVX Animals 

The host cells of the invention can also be used to produce non-human transgenic 
animals. For example, in one embodiment, a host cell of the invention is a fertilized oocyte 
or an embryonic stem cell into which NOVX protein-coding sequences have been 
introduced. Such host cells can then be used to create non-human transgenic animals in 

15 which exogenous NOVX sequences have been introduced into their genome or 

homologous recombinant animals in which endogenous NOVX sequences have been 
altered. Such animals are useful for studying the function and/or activity of NOVX protein 
and for identifying and/or evaluating modulators of NOVX protein activity. As used 
herein, a "transgenic animal" is a non-human animal, preferably a mammal, more 

20 preferably a rodent such as a rat or mouse, in which one or more of the cells of the animal 
includes a transgene. Other examples of transgenic animals include non-human primates, 
sheep, dogs, cows, goats, chickens, amphibians, etc. A transgene is exogenous DNA that is 
integrated into the genome of a cell from which a transgenic animal develops and that 
remains in the genome of the mature animal, thereby directing the expression of an 

25 encoded gene product in one or more cell types or tissues of the transgenic animal. As used 
herein, a "homologous recombinant animal" is a non-human animal, preferably a mammal, 
more preferably a mouse, in which an endogenous NOVX gene has been altered by 
homologous recombination between the endogenous gene and an exogenous DNA 
molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to 

30 development of the animal. 

A transgenic animal of the invention can be created by introducing 
NOVX-encoding nucleic acid into the male pronuclei of a fertilized oocyte (e.g., by 
microinjection, retroviral infection) and allowing the oocyte to develop in a pseudopregnant 
female foster animal. The human NOVX cDNA sequences, i.e., any one of SEQ ID 
NO:2n-1, wherein n is an integer between 1 and 124, can be introduced as a transgene into 
the genome of a non-human animal. Alternatively, a non-human homologue of the human 
NOVX gene, such as a mouse NOVX gene, can be isolated based on hybridization to the 
human NOVX cDNA (described further supra) and used as a transgene. Intronic 

sequences and polyadenylation signals can also be included in the transgene to increase the 
efficiency of expression of the transgene. A tissue-specific regulatory sequence(s) can be 
operably-linked to the NOVX transgene to direct expression of NOVX protein to particular 
cells. Methods for generating transgenic animals via embryo manipulation and 

10 microinjection, particularly animals such as mice, have become conventional in the art and 
are described, for example, in U.S. Patent Nos. 4,736,866; 4,870,009; and 4,873,191; and 
Hogan, 1986. In: Manipulating the Mouse Embryo, Cold Spring Harbor Laboratory 
Press, Cold Spring Harbor, N.Y. Similar methods are used for production of other 
transgenic animals. A transgenic founder animal can be identified based upon the presence 

15 of the NOVX transgene in its genome and/or expression of NOVX mRNA in tissues or 
cells of the animals. A transgenic founder animal can then be used to breed additional 
animals carrying the transgene. Moreover, transgenic animals carrying a 
transgene-encoding NOVX protein can further be bred to other transgenic animals carrying 
other transgenes. 
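
For orientation, the sequence identifiers covered by the expression SEQ ID NO:2n-1 can be enumerated directly. The Python sketch below assumes that "between 1 and 124" is inclusive of both endpoints; the function name is hypothetical.

```python
def nucleic_acid_seq_ids(n_min=1, n_max=124):
    """List SEQ ID NOs of the form 2n-1 for n in [n_min, n_max].

    Assumption: the stated range for n includes both endpoints.
    """
    return [2 * n - 1 for n in range(n_min, n_max + 1)]


ids = nucleic_acid_seq_ids()
print(len(ids), ids[0], ids[-1])
```

Under that assumption the expression designates the odd-numbered identifiers from 1 through 247.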

20 To create a homologous recombinant animal, a vector is prepared which contains at 

least a portion of a NOVX gene into which a deletion, addition or substitution has been 
introduced to thereby alter, e.g., functionally disrupt, the NOVX gene. The NOVX gene 
can be a human gene (e.g., the cDNA of any one of SEQ ID NO:2n-1, wherein n is an 
integer between 1 and 124), but more preferably, is a non-human homologue of a human 
NOVX gene. For example, a mouse homologue of human NOVX gene of SEQ ID 
NO:2n-1, wherein n is an integer between 1 and 124, can be used to construct a 
homologous recombination vector suitable for altering an endogenous NOVX gene in the 
mouse genome. In one embodiment, the vector is designed such that, upon homologous 
recombination, the endogenous NOVX gene is functionally disrupted (i.e., no longer 
encodes a functional protein; also referred to as a "knock out" vector). 

Alternatively, the vector can be designed such that, upon homologous 
recombination, the endogenous NOVX gene is mutated or otherwise altered but still 
encodes functional protein (e.g., the upstream regulatory region can be altered to thereby 

alter the expression of the endogenous NOVX protein). In the homologous recombination 
vector, the altered portion of the NOVX gene is flanked at its 5'- and 3'-termini by 
additional nucleic acid of the NOVX gene to allow for homologous recombination to occur 
between the exogenous NOVX gene carried by the vector and an endogenous NOVX gene 

5 in an embryonic stem cell. The additional flanking NOVX nucleic acid is of sufficient 
length for successful homologous recombination with the endogenous gene. Typically, 
several kilobases of flanking DNA (both at the 5'- and 3'-termini) are included in the 
vector. See, e.g., Thomas, et al., 1987. Cell 51: 503 for a description of homologous 
recombination vectors. The vector is then introduced into an embryonic stem cell line (e.g., 
by electroporation) and cells in which the introduced NOVX gene has 
homologously-recombined with the endogenous NOVX gene are selected. See, e.g., Li, et 
al., 1992. Cell 69: 915. 

The selected cells are then injected into a blastocyst of an animal (e.g., a mouse) to 
form aggregation chimeras. See, e.g., Bradley, 1987. In: Teratocarcinomas and 

15 Embryonic Stem Cells: A Practical Approach, Robertson, ed. IRL, Oxford, pp. 

113-152. A chimeric embryo can then be implanted into a suitable pseudopregnant female 
foster animal and the embryo brought to term. Progeny harboring the 
homologously-recombined DNA in their germ cells can be used to breed animals in which 
all cells of the animal contain the homologously-recombined DNA by germline 

20 transmission of the transgene. Methods for constructing homologous recombination 

vectors and homologous recombinant animals are described further in Bradley, 1991. Curr. 
Opin. Biotechnol. 2: 823-829; PCT International Publication Nos.: WO 90/11354; WO 
91/01140; WO 92/0968; and WO 93/04169. 

In another embodiment, transgenic non-human animals can be produced that 
contain selected systems that allow for regulated expression of the transgene. One example 
of such a system is the cre/loxP recombinase system of bacteriophage P1. For a description 
of the cre/loxP recombinase system, see, e.g., Lakso, et al., 1992. Proc. Natl. Acad. Sci. 
USA 89: 6232-6236. Another example of a recombinase system is the FLP recombinase 
system of Saccharomyces cerevisiae. See, O'Gorman, et al., 1991. Science 251:1351-1355. 

30 If a cre/loxP recombinase system is used to regulate expression of the transgene, animals 
containing transgenes encoding both the Cre recombinase and a selected protein are 
required. Such animals can be provided through the construction of "double" transgenic 
animals, e.g., by mating two transgenic animals, one containing a transgene encoding a 
selected protein and the other containing a transgene encoding a recombinase. 
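
The expected yield of "double" transgenic offspring from such a mating can be estimated with a simple Mendelian calculation. The Python sketch below rests on assumptions not stated in the text: each parent is hemizygous for a single-copy transgene insertion, the two insertions segregate independently, and neither affects viability. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def double_transgenic_fraction():
    """Expected fraction of offspring carrying both transgenes.

    Assumes each parent is hemizygous for one transgene, so half of
    its gametes carry that transgene and half carry none.
    """
    half = Fraction(1, 2)
    parent_a = [("selected_protein", half), (None, half)]
    parent_b = [("recombinase", half), (None, half)]
    total = Fraction(0)
    for (a, pa), (b, pb) in product(parent_a, parent_b):
        if a is not None and b is not None:
            total += pa * pb
    return total


fraction = double_transgenic_fraction()
print(fraction)
```

Under these assumptions about one quarter of the offspring would carry both transgenes; homozygous parents or linked insertions would change the result.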

Clones of the non-human transgenic animals described herein can also be produced 
according to the methods described in Wilmut, et al., 1997. Nature 385: 810-813. In brief, 
a cell (e.g., a somatic cell) from the transgenic animal can be isolated and induced to exit 
the growth cycle and enter G0 phase. The quiescent cell can then be fused, e.g., through the 
use of electrical pulses, to an enucleated oocyte from an animal of the same species from 
which the quiescent cell is isolated. The reconstructed oocyte is then cultured such that it 
develops to morula or blastocyst and then transferred to a pseudopregnant female foster 
animal. The offspring borne of this female foster animal will be a clone of the animal from 
which the cell (e.g., the somatic cell) is isolated. 

Pharmaceutical Compositions 

The NOVX nucleic acid molecules, NOVX proteins, and anti-NOVX antibodies 
(also referred to herein as "active compounds") of the invention, and derivatives, fragments, 

15 analogs and homologs thereof, can be incorporated into pharmaceutical compositions 
suitable for administration. Such compositions typically comprise the nucleic acid 
molecule, protein, or antibody and a pharmaceutically acceptable carrier. As used herein, 
"pharmaceutically acceptable carrier" is intended to include any and all solvents, dispersion 
media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying 

20 agents, and the like, compatible with pharmaceutical administration. Suitable carriers are 
described in the most recent edition of Remington's Pharmaceutical Sciences, a standard 
reference text in the field, which is incorporated herein by reference. Preferred examples of 
such carriers or diluents include, but are not limited to, water, saline, Ringer's solutions, 
dextrose solution, and 5% human serum albumin. Liposomes and non-aqueous vehicles 

25 such as fixed oils may also be used. The use of such media and agents for 

pharmaceutically active substances is well known in the art. Except insofar as any 
conventional media or agent is incompatible with the active compound, use thereof in the 
compositions is contemplated. Supplementary active compounds can also be incorporated 
into the compositions. 

30 A pharmaceutical composition of the invention is formulated to be compatible with 

its intended route of administration. Examples of routes of administration include 
parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal 
(i.e., topical), transmucosal, and rectal administration. Solutions or suspensions used for 
parenteral, intradermal, or subcutaneous application can include the following components: 
a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, 
glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl 
alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; 

5 chelating agents such as ethylenediaminetetraacetic acid (EDTA); buffers such as acetates, 
citrates or phosphates, and agents for the adjustment of tonicity such as sodium chloride or 
dextrose. The pH can be adjusted with acids or bases, such as hydrochloric acid or sodium 
hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes or 
multiple dose vials made of glass or plastic. 

10 Pharmaceutical compositions suitable for injectable use include sterile aqueous 

solutions (where water soluble) or dispersions and sterile powders for the extemporaneous 
preparation of sterile injectable solutions or dispersion. For intravenous administration, 
suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, 
Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be 

15 sterile and should be fluid to the extent that easy syringeability exists. It must be stable 
under the conditions of manufacture and storage and must be preserved against the 
contaminating action of microorganisms such as bacteria and fungi. The carrier can be a 
solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, 
glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable 

20 mixtures thereof. The proper fluidity can be maintained, for example, by the use of a 
coating such as lecithin, by the maintenance of the required particle size in the case of 
dispersion and by the use of surfactants. Prevention of the action of microorganisms can be 
achieved by various antibacterial and antifungal agents, for example, parabens, 
chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be 

25 preferable to include isotonic agents, for example, sugars, polyalcohols such as manitol, 
sorbitol, sodium chloride in the composition. Prolonged absorption of the injectable 
compositions can be brought about by including in the composition an agent which delays 
absorption, for example, aluminum monostearate and gelatin. 

Sterile injectable solutions can be prepared by incorporating the active compound 
(e.g., a NOVX protein or anti-NOVX antibody) in the required amount in an appropriate 
solvent with one or a combination of ingredients enumerated above, as required, followed 
by filtered sterilization. Generally, dispersions are prepared by incorporating the active 
compound into a sterile vehicle that contains a basic dispersion medium and the required 
other ingredients from those enumerated above. In the case of sterile powders for the 
preparation of sterile injectable solutions, methods of preparation are vacuum drying and 
freeze-drying, which yield a powder of the active ingredient plus any additional desired 
ingredient from a previously sterile-filtered solution thereof. 

Oral compositions generally include an inert diluent or an edible carrier. They can 
be enclosed in gelatin capsules or compressed into tablets. For the purpose of oral 
therapeutic administration, the active compound can be incorporated with excipients and 
used in the form of tablets, troches, or capsules. Oral compositions can also be prepared 
using a fluid carrier for use as a mouthwash, wherein the compound in the fluid carrier is 
applied orally and swished and expectorated or swallowed. Pharmaceutically compatible 
binding agents, and/or adjuvant materials can be included as part of the composition. The 
tablets, pills, capsules, troches and the like can contain any of the following ingredients, or 
compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth 
or gelatin; an excipient such as starch or lactose; a disintegrating agent such as alginic acid, 
Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such 
as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring 
agent such as peppermint, methyl salicylate, or orange flavoring. 

For administration by inhalation, the compounds are delivered in the form of an 
aerosol spray from a pressurized container or dispenser which contains a suitable propellant, 
e.g., a gas such as carbon dioxide, or a nebulizer. 

Systemic administration can also be by transmucosal or transdermal means. For 
transmucosal or transdermal administration, penetrants appropriate to the barrier to be 
permeated are used in the formulation. Such penetrants are generally known in the art, and 
include, for example, for transmucosal administration, detergents, bile salts, and fusidic 
acid derivatives. Transmucosal administration can be accomplished through the use of 
nasal sprays or suppositories. For transdermal administration, the active compounds are 
formulated into ointments, salves, gels, or creams as generally known in the art. 

The compounds can also be prepared in the form of suppositories (e.g., with 
conventional suppository bases such as cocoa butter and other glycerides) or retention 
enemas for rectal delivery. 

In one embodiment, the active compounds are prepared with carriers that will 
protect the compound against rapid elimination from the body, such as a controlled release 
formulation, including implants and microencapsulated delivery systems. Biodegradable, 
biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, 
polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation 
of such formulations will be apparent to those skilled in the art. The materials can also be 
obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal 
suspensions (including liposomes targeted to infected cells with monoclonal antibodies to 
viral antigens) can also be used as pharmaceutically acceptable carriers. These can be 
prepared according to methods known to those skilled in the art, for example, as described 
in U.S. Patent No. 4,522,811. 

It is especially advantageous to formulate oral or parenteral compositions in dosage 
unit form for ease of administration and uniformity of dosage. Dosage unit form as used 
herein refers to physically discrete units suited as unitary dosages for the subject to be 
treated; each unit containing a predetermined quantity of active compound calculated to 
produce the desired therapeutic effect in association with the required pharmaceutical 
carrier. The specifications for the dosage unit forms of the invention are dictated by and 
directly dependent on the unique characteristics of the active compound and the particular 
therapeutic effect to be achieved, and the limitations inherent in the art of compounding 
such an active compound for the treatment of individuals. 

The nucleic acid molecules of the invention can be inserted into vectors and used as 
gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, 
intravenous injection, local administration (see, e.g., U.S. Patent No. 5,328,470) or by 
stereotactic injection (see, e.g., Chen, et al., 1994. Proc. Natl. Acad. Sci. USA 91: 
3054-3057). The pharmaceutical preparation of the gene therapy vector can include the 
gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in 
which the gene delivery vehicle is imbedded. Alternatively, where the complete gene 
delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the 
pharmaceutical preparation can include one or more cells that produce the gene delivery 
system. 

The pharmaceutical compositions can be included in a container, pack, or dispenser 
together with instructions for administration. 

Screening and Detection Methods 

The isolated nucleic acid molecules of the invention can be used to express NOVX 
protein (e.g., via a recombinant expression vector in a host cell in gene therapy 
applications), to detect NOVX mRNA (e.g., in a biological sample) or a genetic lesion in a 
NOVX gene, and to modulate NOVX activity, as described further below. In addition, the 
NOVX proteins can be used to screen drugs or compounds that modulate the NOVX 
protein activity or expression as well as to treat disorders characterized by insufficient or 
excessive production of NOVX protein or production of NOVX protein forms that have 
decreased or aberrant activity compared to NOVX wild-type protein (e.g., diabetes 
(regulates insulin release); obesity (binds and transports lipids); metabolic disturbances 
associated with obesity, the metabolic syndrome X as well as anorexia and wasting 
disorders associated with chronic diseases and various cancers; infectious 
disease (possesses anti-microbial activity); and the various dyslipidemias). In addition, the 
anti-NOVX antibodies of the invention can be used to detect and isolate NOVX proteins 
and modulate NOVX activity. In yet a further aspect, the invention can be used in methods 
to influence appetite, absorption of nutrients and the disposition of metabolic substrates in 
both a positive and negative fashion. 

The invention further pertains to novel agents identified by the screening assays 
described herein and uses thereof for treatments as described, supra. 

Screening Assays 

The invention provides a method (also referred to herein as a "screening assay") for 
identifying modulators, i.e., candidate or test compounds or agents (e.g., peptides, 
peptidomimetics, small molecules or other drugs) that bind to NOVX proteins or have a 
stimulatory or inhibitory effect on, e.g., NOVX protein expression or NOVX protein 
activity. The invention also includes compounds identified in the screening assays 
described herein. 

In one embodiment, the invention provides assays for screening candidate or test 
compounds which bind to or modulate the activity of the membrane-bound form of a 
NOVX protein or polypeptide or biologically-active portion thereof. The test compounds 
of the invention can be obtained using any of the numerous approaches in combinatorial 
library methods known in the art, including: biological libraries; spatially addressable 
parallel solid phase or solution phase libraries; synthetic library methods requiring 
deconvolution; the "one-bead one-compound" library method; and synthetic library 
methods using affinity chromatography selection. The biological library approach is 
limited to peptide libraries, while the other four approaches are applicable to peptide, 
non-peptide oligomer or small molecule libraries of compounds. See, e.g., Lam, 1997. 
Anticancer Drug Design 12: 145. 

A "small molecule," as used herein, is meant to refer to a composition that has a 
molecular weight of less than about 5 kD and most preferably less than about 4 kD. Small 
molecules can be, e.g., nucleic acids, peptides, polypeptides, peptidomimetics, 
carbohydrates, lipids or other organic or inorganic molecules. Libraries of chemical and/or 
biological mixtures, such as fungal, bacterial, or algal extracts, are known in the art and can 
be screened with any of the assays of the invention. 

Examples of methods for the synthesis of molecular libraries can be found in the 
art, for example in: DeWitt, et al., 1993. Proc. Natl. Acad. Sci. U.S.A. 90: 6909; Erb, et al., 
1994. Proc. Natl. Acad. Sci. U.S.A. 91: 11422; Zuckermann, et al., 1994. J. Med. Chem. 37: 
2678; Cho, et al., 1993. Science 261: 1303; Carell, et al., 1994. Angew. Chem. Int. Ed. 
Engl. 33: 2059; Carell, et al., 1994. Angew. Chem. Int. Ed. Engl. 33: 2061; and Gallop, et 
al., 1994. J. Med. Chem. 37: 1233. 

Libraries of compounds may be presented in solution (e.g., Houghten, 1992. 
Biotechniques 13: 412-421), or on beads (Lam, 1991. Nature 354: 82-84), on chips (Fodor, 
1993. Nature 364: 555-556), bacteria (Ladner, U.S. Patent No. 5,223,409), spores (Ladner, 
U.S. Patent No. 5,223,409), plasmids (Cull, et al., 1992. Proc. Natl. Acad. Sci. U.S.A. 89: 
1865-1869) or on phage (Scott and Smith, 1990. Science 249: 386-390; Devlin, 1990. 
Science 249: 404-406; Cwirla, et al., 1990. Proc. Natl. Acad. Sci. U.S.A. 87: 6378-6382; 
Felici, 1991. J. Mol. Biol. 222: 301-310; Ladner, U.S. Patent No. 5,223,409). 

In one embodiment, an assay is a cell-based assay in which a cell which expresses a 
membrane-bound form of NOVX protein, or a biologically-active portion thereof, on the 
cell surface is contacted with a test compound and the ability of the test compound to bind 
to a NOVX protein is determined. The cell, for example, can be of mammalian origin or a 
yeast cell. Determining the ability of the test compound to bind to the NOVX protein can be 
accomplished, for example, by coupling the test compound with a radioisotope or 
enzymatic label such that binding of the test compound to the NOVX protein or 
biologically-active portion thereof can be determined by detecting the labeled compound in 
a complex. For example, test compounds can be labeled with 125I, 35S, 14C, or 3H, either 
directly or indirectly, and the radioisotope detected by direct counting of radioemission or 
by scintillation counting. Alternatively, test compounds can be enzymatically-labeled with, 
for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic 
label detected by determination of conversion of an appropriate substrate to product. In 
one embodiment, the assay comprises contacting a cell which expresses a membrane-bound 
form of NOVX protein, or a biologically-active portion thereof, on the cell surface with a 
known compound which binds NOVX to form an assay mixture, contacting the assay 
mixture with a test compound, and determining the ability of the test compound to interact 
with a NOVX protein, wherein determining the ability of the test compound to interact with 
a NOVX protein comprises determining the ability of the test compound to preferentially 
bind to NOVX protein or a biologically-active portion thereof as compared to the known 
compound. 

In another embodiment, an assay is a cell-based assay comprising contacting a cell 
expressing a membrane-bound form of NOVX protein, or a biologically-active portion 
thereof, on the cell surface with a test compound and determining the ability of the test 
compound to modulate (e.g., stimulate or inhibit) the activity of the NOVX protein or 
biologically-active portion thereof. Determining the ability of the test compound to 
modulate the activity of NOVX or a biologically-active portion thereof can be 
accomplished, for example, by determining the ability of the NOVX protein to bind to or 
interact with a NOVX target molecule. As used herein, a "target molecule" is a molecule 
with which a NOVX protein binds or interacts in nature, for example, a molecule on the 
surface of a cell which expresses a NOVX interacting protein, a molecule on the surface of 
a second cell, a molecule in the extracellular milieu, a molecule associated with the internal 
surface of a cell membrane or a cytoplasmic molecule. A NOVX target molecule can be a 
non-NOVX molecule or a NOVX protein or polypeptide of the invention. In one 
embodiment, a NOVX target molecule is a component of a signal transduction pathway 
that facilitates transduction of an extracellular signal (e.g., a signal generated by binding of 
a compound to a membrane-bound NOVX molecule) through the cell membrane and into 
the cell. The target, for example, can be a second intercellular protein that has catalytic 
activity or a protein that facilitates the association of downstream signaling molecules with 
NOVX. 

Determining the ability of the NOVX protein to bind to or interact with a NOVX 
target molecule can be accomplished by one of the methods described above for 
determining direct binding. In one embodiment, determining the ability of the NOVX 
protein to bind to or interact with a NOVX target molecule can be accomplished by 
determining the activity of the target molecule. For example, the activity of the target 
molecule can be determined by detecting induction of a cellular second messenger of the 
target (i.e., intracellular Ca2+, diacylglycerol, IP3, etc.), detecting catalytic/enzymatic 
activity of the target on an appropriate substrate, detecting the induction of a reporter gene 
(comprising a NOVX-responsive regulatory element operatively linked to a nucleic acid 
encoding a detectable marker, e.g., luciferase), or detecting a cellular response, for 
example, cell survival, cellular differentiation, or cell proliferation. 

In yet another embodiment, an assay of the invention is a cell-free assay comprising 
contacting a NOVX protein or biologically-active portion thereof with a test compound and 
determining the ability of the test compound to bind to the NOVX protein or 
biologically-active portion thereof. Binding of the test compound to the NOVX protein can 
be determined either directly or indirectly as described above. In one such embodiment, 
the assay comprises contacting the NOVX protein or biologically-active portion thereof 
with a known compound which binds NOVX to form an assay mixture, contacting the 
assay mixture with a test compound, and determining the ability of the test compound to 
interact with a NOVX protein, wherein determining the ability of the test compound to 
interact with a NOVX protein comprises determining the ability of the test compound to 
preferentially bind to NOVX or biologically-active portion thereof as compared to the 
known compound. 

In still another embodiment, an assay is a cell-free assay comprising contacting 
NOVX protein or biologically-active portion thereof with a test compound and determining 
the ability of the test compound to modulate (e.g., stimulate or inhibit) the activity of the 
NOVX protein or biologically-active portion thereof. Determining the ability of the test 
compound to modulate the activity of NOVX can be accomplished, for example, by 
determining the ability of the NOVX protein to bind to a NOVX target molecule by one of 
the methods described above for determining direct binding. In an alternative embodiment, 
determining the ability of the test compound to modulate the activity of NOVX protein can 
be accomplished by determining the ability of the NOVX protein to further modulate a 
NOVX target molecule. For example, the catalytic/enzymatic activity of the target 
molecule on an appropriate substrate can be determined as described, supra. 

In yet another embodiment, the cell-free assay comprises contacting the NOVX 
protein or biologically-active portion thereof with a known compound which binds NOVX 
protein to form an assay mixture, contacting the assay mixture with a test compound, and 
determining the ability of the test compound to interact with a NOVX protein, wherein 
determining the ability of the test compound to interact with a NOVX protein comprises 
determining the ability of the NOVX protein to preferentially bind to or modulate the 
activity of a NOVX target molecule. 

The cell-free assays of the invention are amenable to use of either the soluble form or 
the membrane-bound form of NOVX protein. In the case of cell-free assays comprising the 
membrane-bound form of NOVX protein, it may be desirable to utilize a solubilizing agent 
such that the membrane-bound form of NOVX protein is maintained in solution. Examples 
of such solubilizing agents include non-ionic detergents such as n-octylglucoside, 
n-dodecylglucoside, n-dodecylmaltoside, octanoyl-N-methylglucamide, 
decanoyl-N-methylglucamide, Triton® X-100, Triton® X-114, Thesit®, 
Isotridecypoly(ethylene glycol ether)n, N-dodecyl-N,N-dimethyl-3-ammonio-1-propane 
sulfonate, 3-(3-cholamidopropyl)dimethylamminiol-1-propane sulfonate (CHAPS), or 
3-(3-cholamidopropyl)dimethylamminiol-2-hydroxy-1-propane sulfonate (CHAPSO). 

In more than one embodiment of the above assay methods of the invention, it may 
be desirable to immobilize either NOVX protein or its target molecule to facilitate 
separation of complexed from uncomplexed forms of one or both of the proteins, as well as 
to accommodate automation of the assay. Binding of a test compound to NOVX protein, or 
interaction of NOVX protein with a target molecule in the presence and absence of a 
candidate compound, can be accomplished in any vessel suitable for containing the 
reactants. Examples of such vessels include microtiter plates, test tubes, and 
micro-centrifuge tubes. In one embodiment, a fusion protein can be provided that adds a 
domain that allows one or both of the proteins to be bound to a matrix. For example, 
GST-NOVX fusion proteins or GST-target fusion proteins can be adsorbed onto 
glutathione sepharose beads (Sigma Chemical, St. Louis, MO) or glutathione derivatized 
microtiter plates, that are then combined with the test compound or the test compound and 
either the non-adsorbed target protein or NOVX protein, and the mixture is incubated under 
conditions conducive to complex formation (e.g., at physiological conditions for salt and 
pH). Following incubation, the beads or microtiter plate wells are washed to remove any 
unbound components, the matrix is immobilized in the case of beads, and complex 
formation is determined either directly or indirectly, for example, as described, supra. 
Alternatively, the complexes can be dissociated from the matrix, and the level of NOVX 
protein binding or activity determined using standard techniques. 

Other techniques for immobilizing proteins on matrices can also be used in the 
screening assays of the invention. For example, either the NOVX protein or its target 
molecule can be immobilized utilizing conjugation of biotin and streptavidin. Biotinylated 
NOVX protein or target molecules can be prepared from biotin-NHS 
(N-hydroxy-succinimide) using techniques well-known within the art (e.g., biotinylation 
kit, Pierce Chemicals, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 
96 well plates (Pierce Chemical). Alternatively, antibodies reactive with NOVX protein or 
target molecules, but which do not interfere with binding of the NOVX protein to its target 
molecule, can be derivatized to the wells of the plate, and unbound target or NOVX protein 
trapped in the wells by antibody conjugation. Methods for detecting such complexes, in 
addition to those described above for the GST-immobilized complexes, include 
immunodetection of complexes using antibodies reactive with the NOVX protein or target 
molecule, as well as enzyme-linked assays that rely on detecting an enzymatic activity 
associated with the NOVX protein or target molecule. 

In another embodiment, modulators of NOVX protein expression are identified in a 
method wherein a cell is contacted with a candidate compound and the expression of 
NOVX mRNA or protein in the cell is determined. The level of expression of NOVX 
mRNA or protein in the presence of the candidate compound is compared to the level of 
expression of NOVX mRNA or protein in the absence of the candidate compound. The 
candidate compound can then be identified as a modulator of NOVX mRNA or protein 
expression based upon this comparison. For example, when expression of NOVX mRNA 
or protein is greater (i.e., statistically significantly greater) in the presence of the candidate 
compound than in its absence, the candidate compound is identified as a stimulator of 
NOVX mRNA or protein expression. Alternatively, when expression of NOVX mRNA or 
protein is less (i.e., statistically significantly less) in the presence of the candidate compound 
than in its absence, the candidate compound is identified as an inhibitor of NOVX mRNA 
or protein expression. The level of NOVX mRNA or protein expression in the cells can be 
determined by methods described herein for detecting NOVX mRNA or protein. 
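
The comparison described above can be sketched in Python as follows. This is only an illustrative sketch: the specification does not name a statistical test, so the exact permutation test, the 0.05 significance threshold, and the function names used here are assumptions, and the expression values are taken to be replicate measurements supplied by the user.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, untreated):
    """Exact two-sided permutation test on the difference of group means.

    Assumption: the test itself is an illustrative choice, not one named
    in the specification.
    """
    pooled = list(treated) + list(untreated)
    n_treated = len(treated)
    observed = abs(mean(treated) - mean(untreated))
    indices = range(len(pooled))
    total = 0
    extreme = 0
    # Enumerate every way of relabeling the pooled values as "treated".
    for chosen in combinations(indices, n_treated):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen_set]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


def classify_candidate(treated, untreated, alpha=0.05):
    """Label a candidate compound from expression with and without it."""
    p_value = permutation_p_value(treated, untreated)
    if p_value >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(untreated) else "inhibitor"
```

With five replicates per condition there are 252 possible relabelings, so the smallest attainable two-sided p-value is 2/252; fewer replicates may not be able to reach a 0.05 threshold with this particular test.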

In yet another aspect of the invention, the NOVX proteins can be used as "bait 
proteins" in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Patent No. 5,283,317; 
Zervos, et al., 1993. Cell 72: 223-232; Madura, et al., 1993. J. Biol. Chem. 268: 
12046-12054; Bartel, et al., 1993. Biotechniques 14: 920-924; Iwabuchi, et al., 1993. 
Oncogene 8: 1693-1696; and Brent WO 94/10300), to identify other proteins that bind to or 
interact with NOVX ("NOVX-binding proteins" or "NOVX-bp") and modulate NOVX 
activity. Such NOVX-binding proteins are also involved in the propagation of signals by 
the NOVX proteins as, for example, upstream or downstream elements of the NOVX 
pathway. 

The two-hybrid system is based on the modular nature of most transcription factors, 
which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes 
two different DNA constructs. In one construct, the gene that codes for NOVX is fused to a 
gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In 
the other construct, a DNA sequence, from a library of DNA sequences, that encodes an 
unidentified protein ("prey" or "sample") is fused to a gene that codes for the activation 
domain of the known transcription factor. If the "bait" and the "prey" proteins are able to 
interact, in vivo, forming a NOVX-dependent complex, the DNA-binding and activation 
domains of the transcription factor are brought into close proximity. This proximity allows 
transcription of a reporter gene (e.g., LacZ) that is operably linked to a transcriptional 
regulatory site responsive to the transcription factor. Expression of the reporter gene can 
be detected and cell colonies containing the functional transcription factor can be isolated 
and used to obtain the cloned gene that encodes the protein which interacts with NOVX. 

The invention further pertains to novel agents identified by the aforementioned 
screening assays and uses thereof for treatments as described herein. 

Detection Assays 

Portions or fragments of the cDNA sequences identified herein (and the 
corresponding complete gene sequences) can be used in numerous ways as polynucleotide 
reagents. By way of example, and not of limitation, these sequences can be used to: (i) 
map their respective genes on a chromosome and, thus, locate gene regions associated with 
genetic disease; (ii) identify an individual from a minute biological sample (tissue typing); 
and (iii) aid in forensic identification of a biological sample. Some of these applications 
are described in the subsections, below. 

Chromosome Mapping 

Once the sequence (or a portion of the sequence) of a gene has been isolated, this 
sequence can be used to map the location of the gene on a chromosome. This process is 
called chromosome mapping. Accordingly, portions or fragments of the NOVX sequences 
of SEQ ID NO:2n-1, wherein n is an integer between 1 and 124, or fragments or derivatives 
thereof, can be used to map the location of the NOVX genes, respectively, on a 
chromosome. The mapping of the NOVX sequences to chromosomes is an important first 
step in correlating these sequences with genes associated with disease. 

Briefly, NOVX genes can be mapped to chromosomes by preparing PCR primers 
(preferably 15-25 bp in length) from the NOVX sequences. Computer analysis of the 
NOVX sequences can be used to rapidly select primers that do not span more than one 
exon in the genomic DNA, since a primer spanning more than one exon would complicate 
the amplification process. These primers can 
then be used for PCR screening of somatic cell hybrids containing individual human 
chromosomes. Only those hybrids containing the human gene corresponding to the NOVX 
sequences will yield an amplified fragment. 
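
The primer selection step can be illustrated with a short Python sketch. It assumes, hypothetically, that the exon boundaries of the cDNA are already known and supplied as half-open (start, end) intervals in cDNA coordinates; the function name and input format are illustrative only. A primer window that crosses an exon junction in the cDNA would be interrupted by an intron in genomic DNA, so only windows lying wholly within one exon are kept.

```python
def primers_within_single_exon(cdna, exon_bounds, min_len=15, max_len=25):
    """List candidate primer windows that lie entirely inside one exon.

    cdna        -- cDNA sequence as a string
    exon_bounds -- hypothetical input: list of half-open (start, end)
                   intervals marking each exon in cDNA coordinates
    Returns a list of (start, end, sequence) tuples.
    """
    candidates = []
    for exon_start, exon_end in exon_bounds:
        for length in range(min_len, max_len + 1):
            # The last valid start leaves room for a full-length window.
            for start in range(exon_start, exon_end - length + 1):
                end = start + length
                candidates.append((start, end, cdna[start:end]))
    return candidates
```

A practical design tool would further filter these windows (for example by melting temperature or sequence uniqueness); those criteria are outside what the text above describes and are omitted here.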

Somatic cell hybrids are prepared by fusing somatic cells from different mammals 
(e.g., human and mouse cells). As hybrids of human and mouse cells grow and divide, they 
gradually lose human chromosomes in random order, but retain the mouse chromosomes. 
By using media in which mouse cells cannot grow, because they lack a particular enzyme, 
but in which human cells can, the one human chromosome that contains the gene encoding 
the needed enzyme will be retained. By using various media, panels of hybrid cell lines 
can be established. Each cell line in a panel contains either a single human chromosome or 
a small number of human chromosomes, and a full set of mouse chromosomes, allowing 
easy mapping of individual genes to specific human chromosomes. See, e.g., D'Eustachio, 
et al., 1983. Science 220: 919-924. Somatic cell hybrids containing only fragments of 
human chromosomes can also be produced by using human chromosomes with 
translocations and deletions. 

PCR mapping of somatic cell hybrids is a rapid procedure for assigning a particular 
sequence to a particular chromosome. Three or more sequences can be assigned per day 
using a single thermal cycler. Using the NOVX sequences to design oligonucleotide 
primers, sub-localization can be achieved with panels of fragments from specific 
chromosomes. 
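
The assignment logic of PCR mapping on a hybrid panel can be sketched as a concordance check. The sketch below is illustrative and rests on assumptions: the panel composition, the cell line names, and the function name are hypothetical, and the PCR results are treated as error-free. A chromosome is reported only if it is present in every PCR-positive line and absent from every PCR-negative line.

```python
def assign_chromosome(panel, pcr_positive):
    """Return human chromosomes fully concordant with the PCR results.

    panel        -- dict mapping hybrid cell line name to the set of human
                    chromosomes that line retains (hypothetical data)
    pcr_positive -- dict mapping cell line name to True if the primers
                    yielded an amplified fragment in that line
    """
    chromosomes = set().union(*panel.values())
    concordant = []
    for chrom in sorted(chromosomes):
        # Presence of the chromosome must match the PCR result in all lines.
        if all((chrom in retained) == pcr_positive[line]
               for line, retained in panel.items()):
            concordant.append(chrom)
    return concordant


# Hypothetical four-line panel for illustration only.
example_panel = {
    "A": {"1", "7"},
    "B": {"7", "X"},
    "C": {"1", "X"},
    "D": {"11"},
}
example_results = {"A": True, "B": True, "C": False, "D": False}
print(assign_chromosome(example_panel, example_results))
```

If more than one chromosome is returned, the panel cannot distinguish them and additional hybrid lines would be needed.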

Fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase 
chromosomal spread can further be used to provide a precise chromosomal location in one 
step. Chromosome spreads can be made using cells whose division has been blocked in 
metaphase by a chemical like colcemid that disrupts the mitotic spindle. The chromosomes 
can be treated briefly with trypsin, and then stained with Giemsa. A pattern of light and 
dark bands develops on each chromosome, so that the chromosomes can be identified 
individually. The FISH technique can be used with a DNA sequence as short as 500 or 600 
bases. However, clones larger than 1,000 bases have a higher likelihood of binding to a 
unique chromosomal location with sufficient signal intensity for simple detection. 
Preferably 1,000 bases, and more preferably 2,000 bases, will suffice to get good results in 
a reasonable amount of time. For a review of this technique, see Verma, et al., Human 
Chromosomes: A Manual of Basic Techniques (Pergamon Press, New York 1988). 

Reagents for chromosome mapping can be used individually to mark a single 
chromosome or a single site on that chromosome, or panels of reagents can be used for 
marking multiple sites and/or multiple chromosomes. Reagents corresponding to 
noncoding regions of the genes actually are preferred for mapping purposes. Coding 
sequences are more likely to be conserved within gene families, thus increasing the chance 
of cross hybridizations during chromosomal mapping. 

Once a sequence has been mapped to a precise chromosomal location, the physical 
position of the sequence on the chromosome can be correlated with genetic map data. Such 
data are found, e.g., in McKusick, Mendelian Inheritance in Man (available on-line 
through Johns Hopkins University Welch Medical Library). The relationship between 
genes and disease, mapped to the same chromosomal region, can then be identified through 
linkage analysis (co-inheritance of physically adjacent genes), described in, e.g., Egeland, 
et al., 1987. Nature, 325: 783-787. 

Moreover, differences in the DNA sequences between individuals affected and 
unaffected with a disease associated with the NOVX gene can be determined. If a 
mutation is observed in some or all of the affected individuals but not in any unaffected 
individuals, then the mutation is likely to be the causative agent of the particular disease. 
Comparison of affected and unaffected individuals generally involves first looking for 
structural alterations in the chromosomes, such as deletions or translocations that are 
visible from chromosome spreads or detectable using PCR based on that DNA sequence. 
Ultimately, complete sequencing of genes from several individuals can be performed to 
confirm the presence of a mutation and to distinguish mutations from polymorphisms. 

Tissue Typing 

The NOVX sequences of the invention can also be used to identify individuals from 
minute biological samples. In this technique, an individual's genomic DNA is digested 
with one or more restriction enzymes, and probed on a Southern blot to yield unique bands 
for identification. The sequences of the invention are useful as additional DNA markers for 
RFLP ("restriction fragment length polymorphisms," described in U.S. Patent No. 
5,272,057). 

Furthermore, the sequences of the invention can be used to provide an alternative 
technique that determines the actual base-by-base DNA sequence of selected portions of an 
individual's genome. Thus, the NOVX sequences described herein can be used to prepare 
two PCR primers from the 5'- and 3'-termini of the sequences. These primers can then be 
used to amplify an individual's DNA and subsequently sequence it. 
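The preparation of two primers from the termini of a known sequence can be sketched as follows. This is a minimal illustration only: the primer length, the toy sequence, and the function names are assumptions introduced here and are not part of the specification.

```python
# Illustrative sketch: derive a primer pair from the two termini of a known
# sequence. Primer length and the toy sequence are assumptions.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def terminal_primers(seq: str, length: int = 20) -> tuple[str, str]:
    """Return (forward, reverse) primers taken from the 5'- and 3'-termini."""
    if len(seq) < 2 * length:
        raise ValueError("sequence too short for two non-overlapping primers")
    seq = seq.upper()
    forward = seq[:length]                       # same sense as the 5' end
    reverse = reverse_complement(seq[-length:])  # anneals at the 3' end
    return forward, reverse


toy = "ATGGCCAAGT" + "TGACCGGTAC" + "CTTGAAACGT"  # hypothetical 30-base sequence
fwd, rev = terminal_primers(toy, length=8)
print(fwd, rev)
```

In practice primer design would also weigh melting temperature and specificity, which this sketch does not attempt.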

Panels of corresponding DNA sequences from individuals, prepared in this manner,
can provide unique individual identifications, as each individual will have a unique set of
such DNA sequences due to allelic differences. The sequences of the invention can be used
to obtain such identification sequences from individuals and from tissue. The NOVX
sequences of the invention uniquely represent portions of the human genome. Allelic
variation occurs to some degree in the coding regions of these sequences, and to a greater
degree in the noncoding regions. It is estimated that allelic variation between individual
humans occurs with a frequency of about once per 500 bases. Much of the allelic
variation is due to single nucleotide polymorphisms (SNPs), which include restriction
fragment length polymorphisms (RFLPs).

Each of the sequences described herein can, to some degree, be used as a standard
against which DNA from an individual can be compared for identification purposes.
Because greater numbers of polymorphisms occur in the noncoding regions, fewer
sequences are necessary to differentiate individuals. The noncoding sequences can
comfortably provide positive individual identification with a panel of perhaps 10 to 1,000
primers that each yield a noncoding amplified sequence of 100 bases. If coding sequences,
such as those of SEQ ID NO:2n-1, wherein n is an integer between 1 and 124, are used, a
more appropriate number of primers for positive individual identification would be
500-2,000.
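As a rough worked example of the figures above, the expected number of variant sites sampled by a panel can be estimated from the amplicon length and the variation frequency. The sketch below assumes the overall estimate of about one variation per 500 bases applies uniformly; the text indicates the frequency is higher in noncoding regions and lower in coding regions, so the result is only an order-of-magnitude guide. The function name is an assumption introduced here.

```python
# Illustrative arithmetic only: assumes a uniform rate of one variation per
# 500 bases, which the text gives as an overall estimate.
def expected_variant_sites(n_primers: int,
                           amplicon_bases: int = 100,
                           bases_per_variant: float = 500.0) -> float:
    """Expected number of variant sites covered by a panel of amplicons."""
    return n_primers * amplicon_bases / bases_per_variant


for n in (10, 1000):
    print(n, expected_variant_sites(n))
```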

Predictive Medicine 

The invention also pertains to the field of predictive medicine in which diagnostic 
assays, prognostic assays, pharmacogenomics, and monitoring clinical trials are used for 
prognostic (predictive) purposes to thereby treat an individual prophylactically.

Accordingly, one aspect of the invention relates to diagnostic assays for determining
NOVX protein and/or nucleic acid expression as well as NOVX activity, in the context of a
biological sample (e.g., blood, serum, cells, tissue) to thereby determine whether an
individual is afflicted with a disease or disorder, or is at risk of developing a disorder,
associated with aberrant NOVX expression or activity. The disorders include metabolic
disorders, diabetes, obesity, infectious disease, anorexia, cancer-associated cachexia,
cancer, neurodegenerative disorders, Alzheimer's Disease, Parkinson's Disorder, immune
disorders, and hematopoietic disorders, and the various dyslipidemias, metabolic
disturbances associated with obesity, the metabolic syndrome X and wasting disorders
associated with chronic diseases and various cancers. The invention also provides for
prognostic (or predictive) assays for determining whether an individual is at risk of
developing a disorder associated with NOVX protein, nucleic acid expression or activity.
For example, mutations in a NOVX gene can be assayed in a biological sample. Such
assays can be used for prognostic or predictive purposes to thereby prophylactically treat an
individual prior to the onset of a disorder characterized by or associated with NOVX
protein, nucleic acid expression, or biological activity.

Another aspect of the invention provides methods for determining NOVX protein,
nucleic acid expression or activity in an individual to thereby select appropriate therapeutic
or prophylactic agents for that individual (referred to herein as "pharmacogenomics").
Pharmacogenomics allows for the selection of agents (e.g., drugs) for therapeutic or
prophylactic treatment of an individual based on the genotype of the individual (e.g., the
genotype of the individual examined to determine the ability of the individual to respond to
a particular agent).

Yet another aspect of the invention pertains to monitoring the influence of agents 
(e.g., drugs, compounds) on the expression or activity of NOVX in clinical trials. 

These and other agents are described in further detail in the following sections. 

Diagnostic Assays 

An exemplary method for detecting the presence or absence of NOVX in a
biological sample involves obtaining a biological sample from a test subject and contacting
the biological sample with a compound or an agent capable of detecting NOVX protein or
nucleic acid (e.g., mRNA, genomic DNA) that encodes NOVX protein such that the
presence of NOVX is detected in the biological sample. An agent for detecting NOVX
mRNA or genomic DNA is a labeled nucleic acid probe capable of hybridizing to NOVX
mRNA or genomic DNA. The nucleic acid probe can be, for example, a full-length NOVX
nucleic acid, such as the nucleic acid of SEQ ID NO:2n-1, wherein n is an integer between
1 and 124, or a portion thereof, such as an oligonucleotide of at least 15, 30, 50, 100, 250 or
500 nucleotides in length and sufficient to specifically hybridize under stringent conditions
to NOVX mRNA or genomic DNA. Other suitable probes for use in the diagnostic assays
of the invention are described herein.

An agent for detecting NOVX protein is an antibody capable of binding to NOVX
protein, preferably an antibody with a detectable label. Antibodies can be polyclonal, or
more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or
F(ab')2) can be used. The term "labeled", with regard to the probe or antibody, is intended
to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking)
a detectable substance to the probe or antibody, as well as indirect labeling of the probe or
antibody by reactivity with another reagent that is directly labeled. Examples of indirect
labeling include detection of a primary antibody using a fluorescently-labeled secondary
antibody and end-labeling of a DNA probe with biotin such that it can be detected with
fluorescently-labeled streptavidin. The term "biological sample" is intended to include
tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids
present within a subject. That is, the detection method of the invention can be used to
detect NOVX mRNA, protein, or genomic DNA in a biological sample in vitro as well as
in vivo. For example, in vitro techniques for detection of NOVX mRNA include Northern
hybridizations and in situ hybridizations. In vitro techniques for detection of NOVX
protein include enzyme linked immunosorbent assays (ELISAs), Western blots,
immunoprecipitations, and immunofluorescence. In vitro techniques for detection of
NOVX genomic DNA include Southern hybridizations. Furthermore, in vivo techniques
for detection of NOVX protein include introducing into a subject a labeled anti-NOVX
antibody. For example, the antibody can be labeled with a radioactive marker whose
presence and location in a subject can be detected by standard imaging techniques.

In one embodiment, the biological sample contains protein molecules from the test
subject. Alternatively, the biological sample can contain mRNA molecules from the test
subject or genomic DNA molecules from the test subject. A preferred biological sample is
a peripheral blood leukocyte sample isolated by conventional means from a subject.

In another embodiment, the methods further involve obtaining a control biological
sample from a control subject, contacting the control sample with a compound or agent
capable of detecting NOVX protein, mRNA, or genomic DNA, such that the presence of
NOVX protein, mRNA or genomic DNA is detected in the biological sample, and
comparing the presence of NOVX protein, mRNA or genomic DNA in the control sample
with the presence of NOVX protein, mRNA or genomic DNA in the test sample.

The invention also encompasses kits for detecting the presence of NOVX in a
biological sample. For example, the kit can comprise: a labeled compound or agent
capable of detecting NOVX protein or mRNA in a biological sample; means for
determining the amount of NOVX in the sample; and means for comparing the amount of
NOVX in the sample with a standard. The compound or agent can be packaged in a
suitable container. The kit can further comprise instructions for using the kit to detect
NOVX protein or nucleic acid.

Prognostic Assays

The diagnostic methods described herein can furthermore be utilized to identify
subjects having or at risk of developing a disease or disorder associated with aberrant
NOVX expression or activity. For example, the assays described herein, such as the
preceding diagnostic assays or the following assays, can be utilized to identify a subject
having or at risk of developing a disorder associated with NOVX protein, nucleic acid
expression or activity. Alternatively, the prognostic assays can be utilized to identify a
subject having or at risk for developing a disease or disorder. Thus, the invention provides
a method for identifying a disease or disorder associated with aberrant NOVX expression
or activity in which a test sample is obtained from a subject and NOVX protein or nucleic
acid (e.g., mRNA, genomic DNA) is detected, wherein the presence of NOVX protein or
nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder
associated with aberrant NOVX expression or activity. As used herein, a "test sample"
refers to a biological sample obtained from a subject of interest. For example, a test sample
can be a biological fluid (e.g., serum), cell sample, or tissue.

Furthermore, the prognostic assays described herein can be used to determine
whether a subject can be administered an agent (e.g., an agonist, antagonist,
peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) to
treat a disease or disorder associated with aberrant NOVX expression or activity. For
example, such methods can be used to determine whether a subject can be effectively
treated with an agent for a disorder. Thus, the invention provides methods for determining
whether a subject can be effectively treated with an agent for a disorder associated with
aberrant NOVX expression or activity in which a test sample is obtained and NOVX
protein or nucleic acid is detected (e.g., wherein the presence of NOVX protein or nucleic
acid is diagnostic for a subject that can be administered the agent to treat a disorder
associated with aberrant NOVX expression or activity).

The methods of the invention can also be used to detect genetic lesions in a NOVX
gene, thereby determining if a subject with the lesioned gene is at risk for a disorder
characterized by aberrant cell proliferation and/or differentiation. In various embodiments,
the methods include detecting, in a sample of cells from the subject, the presence or
absence of a genetic lesion characterized by at least one of an alteration affecting the
integrity of a gene encoding a NOVX-protein, or the misexpression of the NOVX gene.
For example, such genetic lesions can be detected by ascertaining the existence of at least
one of: (i) a deletion of one or more nucleotides from a NOVX gene; (ii) an addition of one
or more nucleotides to a NOVX gene; (iii) a substitution of one or more nucleotides of a
NOVX gene; (iv) a chromosomal rearrangement of a NOVX gene; (v) an alteration in the
level of a messenger RNA transcript of a NOVX gene; (vi) aberrant modification of a
NOVX gene, such as of the methylation pattern of the genomic DNA; (vii) the presence of
a non-wild-type splicing pattern of a messenger RNA transcript of a NOVX gene; (viii) a
non-wild-type level of a NOVX protein; (ix) allelic loss of a NOVX gene; and (x)
inappropriate post-translational modification of a NOVX protein. As described herein,
there are a large number of assay techniques known in the art which can be used for
detecting lesions in a NOVX gene. A preferred biological sample is a peripheral blood
leukocyte sample isolated by conventional means from a subject. However, any biological
sample containing nucleated cells may be used, including, for example, buccal mucosal
cells.

In certain embodiments, detection of the lesion involves the use of a probe/primer in
a polymerase chain reaction (PCR) (see, e.g., U.S. Patent Nos. 4,683,195 and 4,683,202),
such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR)
(see, e.g., Landegran, et al., 1988. Science 241: 1077-1080; and Nakazawa, et al., 1994.
Proc. Natl. Acad. Sci. USA 91: 360-364), the latter of which can be particularly useful for
detecting point mutations in the NOVX-gene (see, Abravaya, et al., 1995. Nucl. Acids Res.
23: 675-682). This method can include the steps of collecting a sample of cells from a
patient, isolating nucleic acid (e.g., genomic, mRNA or both) from the cells of the sample,
contacting the nucleic acid sample with one or more primers that specifically hybridize to a
NOVX gene under conditions such that hybridization and amplification of the NOVX gene
(if present) occurs, and detecting the presence or absence of an amplification product, or
detecting the size of the amplification product and comparing the length to a control
sample. It is anticipated that PCR and/or LCR may be desirable to use as a preliminary
amplification step in conjunction with any of the techniques used for detecting mutations
described herein.

Alternative amplification methods include: self sustained sequence replication (see,
Guatelli, et al., 1990. Proc. Natl. Acad. Sci. USA 87: 1874-1878), transcriptional
amplification system (see, Kwoh, et al., 1989. Proc. Natl. Acad. Sci. USA 86: 1173-1177),
Qβ Replicase (see, Lizardi, et al., 1988. BioTechnology 6: 1197), or any other nucleic acid
amplification method, followed by the detection of the amplified molecules using
techniques well known to those of skill in the art. These detection schemes are especially
useful for the detection of nucleic acid molecules if such molecules are present in very low
numbers.

In an alternative embodiment, mutations in a NOVX gene from a sample cell can be
identified by alterations in restriction enzyme cleavage patterns. For example, sample and
control DNA is isolated, amplified (optionally), digested with one or more restriction
endonucleases, and fragment length sizes are determined by gel electrophoresis and
compared. Differences in fragment length sizes between sample and control DNA
indicate mutations in the sample DNA. Moreover, sequence specific ribozymes
(see, e.g., U.S. Patent No. 5,493,531) can be used to score for the presence of specific
mutations by development or loss of a ribozyme cleavage site.
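The comparison of restriction fragment lengths between sample and control DNA can be sketched in silico as follows. This is an illustration under stated assumptions: the toy sequences and the function name are hypothetical, and EcoRI (recognition site GAATTC, cutting after the first G) is used only as a familiar example enzyme, not as one named in the specification.

```python
# Illustrative sketch: compare restriction fragment lengths of two toy
# sequences. Sequences and names are assumptions for illustration.
def digest(seq: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths after cutting `seq` at every occurrence of
    `site`; the cut falls `cut_offset` bases into the recognition site."""
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


ECORI_SITE, ECORI_OFFSET = "GAATTC", 1  # EcoRI cuts G^AATTC

control = "AAAA" + "GAATTC" + "AAAAAAAA" + "GAATTC" + "AAAA"
sample = "AAAA" + "GAATTC" + "AAAAAAAA" + "GAGTTC" + "AAAA"  # one base changed

control_fragments = digest(control, ECORI_SITE, ECORI_OFFSET)
sample_fragments = digest(sample, ECORI_SITE, ECORI_OFFSET)
print(control_fragments, sample_fragments)
print("pattern differs:", control_fragments != sample_fragments)
```

Here the single base change abolishes one recognition site, so two control fragments merge into one longer sample fragment, which is the kind of difference gel electrophoresis would reveal.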

In other embodiments, genetic mutations in NOVX can be identified by hybridizing
a sample and control nucleic acids, e.g., DNA or RNA, to high-density arrays containing
hundreds or thousands of oligonucleotide probes. See, e.g., Cronin, et al., 1996. Human
Mutation 7: 244-255; Kozal, et al., 1996. Nat. Med. 2: 753-759. For example, genetic
mutations in NOVX can be identified in two dimensional arrays containing light-generated
DNA probes as described in Cronin, et al., supra. Briefly, a first hybridization array of
probes can be used to scan through long stretches of DNA in a sample and control to
identify base changes between the sequences by making linear arrays of sequential
overlapping probes. This step allows the identification of point mutations. This is
followed by a second hybridization array that allows the characterization of specific
mutations by using smaller, specialized probe arrays complementary to all variants or
mutations detected. Each mutation array is composed of parallel probe sets, one
complementary to the wild-type gene and the other complementary to the mutant gene.
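The idea of a linear array of sequential overlapping probes can be illustrated with a short sketch. The probe length, step size, toy sequence, and function names below are assumptions chosen for illustration; the windows are written in the sense of the reference sequence, whereas a synthesized probe would be complementary to its target.

```python
# Illustrative sketch: tile a reference sequence with sequential overlapping
# probe windows and list the windows that interrogate a given position.
def tile_probes(seq: str, probe_len: int = 25, step: int = 1) -> list[tuple[int, str]]:
    """Return (start, window) pairs for overlapping windows along `seq`."""
    return [(i, seq[i:i + probe_len])
            for i in range(0, len(seq) - probe_len + 1, step)]


def probes_covering(position: int, probes: list[tuple[int, str]]) -> list[int]:
    """Return start indices of the windows that include 0-based `position`."""
    return [start for start, window in probes
            if start <= position < start + len(window)]


reference = "ATGGCCAAGTTG"  # hypothetical 12-base stretch
probes = tile_probes(reference, probe_len=5, step=1)
print(len(probes), probes_covering(6, probes))
```

Because each base is covered by several overlapping windows, a single base change would be expected to perturb hybridization to every window that spans it.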
In yet another embodiment, any of a variety of sequencing reactions known in the
art can be used to directly sequence the NOVX gene and detect mutations by comparing the
sequence of the sample NOVX with the corresponding wild-type (control) sequence.
Examples of sequencing reactions include those based on techniques developed by Maxam
and Gilbert, 1977. Proc. Natl. Acad. Sci. USA 74: 560 or Sanger, 1977. Proc. Natl. Acad.
Sci. USA 74: 5463. It is also contemplated that any of a variety of automated sequencing
procedures can be utilized when performing the diagnostic assays (see, e.g., Naeve, et al.,
1995. Biotechniques 19: 448), including sequencing by mass spectrometry (see, e.g., PCT
International Publication No. WO 94/16101; Cohen, et al., 1996. Adv. Chromatography 36:
127-162; and Griffin, et al., 1993. Appl. Biochem. Biotechnol. 38: 147-159).
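Once a sample sequence has been obtained, the comparison with the wild-type (control) sequence can be sketched as a position-by-position scan. The sketch below is a simplification that assumes two already-aligned sequences of equal length and reports substitutions only; insertions and deletions would require an alignment step that is not shown. The sequences and function name are hypothetical.

```python
# Illustrative sketch: report substitutions between an aligned sample
# sequence and the wild-type (control) sequence.
def point_differences(wild_type: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, wild-type base, sample base) per mismatch."""
    if len(wild_type) != len(sample):
        raise ValueError("this sketch handles equal-length sequences only")
    pairs = zip(wild_type.upper(), sample.upper())
    return [(i + 1, w, s) for i, (w, s) in enumerate(pairs) if w != s]


wild = "ATGGCCAAGT"    # hypothetical wild-type stretch
sample = "ATGGCTAAGT"  # hypothetical sample with one substitution
print(point_differences(wild, sample))
```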

Other methods for detecting mutations in the NOVX gene include methods in which
protection from cleavage agents is used to detect mismatched bases in RNA/RNA or
RNA/DNA heteroduplexes. See, e.g., Myers, et al., 1985. Science 230: 1242. In general,
the art technique of "mismatch cleavage" starts by providing heteroduplexes formed by
hybridizing (labeled) RNA or DNA containing the wild-type NOVX sequence with
potentially mutant RNA or DNA obtained from a tissue sample. The double-stranded
duplexes are treated with an agent that cleaves single-stranded regions of the duplex, such
as those which exist due to basepair mismatches between the control and sample strands.
For instance, RNA/DNA duplexes can be treated with RNase and DNA/DNA hybrids
treated with S1 nuclease to enzymatically digest the mismatched regions. In other
embodiments, either DNA/DNA or RNA/DNA duplexes can be treated with
hydroxylamine or osmium tetroxide and with piperidine in order to digest mismatched
regions. After digestion of the mismatched regions, the resulting material is then separated
by size on denaturing polyacrylamide gels to determine the site of mutation. See, e.g.,
Cotton, et al., 1988. Proc. Natl. Acad. Sci. USA 85: 4397; Saleeba, et al., 1992. Methods
Enzymol. 217: 286-295. In an embodiment, the control DNA or RNA can be labeled for
detection.

In still another embodiment, the mismatch cleavage reaction employs one or more
proteins that recognize mismatched base pairs in double-stranded DNA (so called "DNA
mismatch repair" enzymes) in defined systems for detecting and mapping point mutations
in NOVX cDNAs obtained from samples of cells. For example, the mutY enzyme of E.
coli cleaves A at G/A mismatches and the thymine DNA glycosylase from HeLa cells
cleaves T at G/T mismatches. See, e.g., Hsu, et al., 1994. Carcinogenesis 15: 1657-1662.
According to an exemplary embodiment, a probe based on a NOVX sequence, e.g., a
wild-type NOVX sequence, is hybridized to a cDNA or other DNA product from a test
cell(s). The duplex is treated with a DNA mismatch repair enzyme, and the cleavage
products, if any, can be detected from electrophoresis protocols or the like. See, e.g., U.S.
Patent No. 5,459,039.

In other embodiments, alterations in electrophoretic mobility will be used to
identify mutations in NOVX genes. For example, single strand conformation
polymorphism (SSCP) may be used to detect differences in electrophoretic mobility
between mutant and wild type nucleic acids. See, e.g., Orita, et al., 1989. Proc. Natl. Acad.
Sci. USA 86: 2766; Cotton, 1993. Mutat. Res. 285: 125-144; Hayashi, 1992. Genet. Anal.
Tech. Appl. 9: 73-79. Single-stranded DNA fragments of sample and control NOVX
nucleic acids will be denatured and allowed to renature. The secondary structure of
single-stranded nucleic acids varies according to sequence; the resulting alteration in
electrophoretic mobility enables the detection of even a single base change. The DNA
fragments may be labeled or detected with labeled probes. The sensitivity of the assay may
be enhanced by using RNA (rather than DNA), in which the secondary structure is more
sensitive to a change in sequence. In one embodiment, the subject method utilizes
heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of
changes in electrophoretic mobility. See, e.g., Keen, et al., 1991. Trends Genet. 7: 5.

In yet another embodiment, the movement of mutant or wild-type fragments in
polyacrylamide gels containing a gradient of denaturant is assayed using denaturing
gradient gel electrophoresis (DGGE). See, e.g., Myers, et al., 1985. Nature 313: 495.
When DGGE is used as the method of analysis, DNA will be modified to ensure that it does
not completely denature, for example by adding a GC clamp of approximately 40 bp of
high-melting GC-rich DNA by PCR. In a further embodiment, a temperature gradient is
used in place of a denaturing gradient to identify differences in the mobility of control and
sample DNA. See, e.g., Rosenbaum and Reissner, 1987. Biophys. Chem. 265: 12753.

Examples of other techniques for detecting point mutations include, but are not
limited to, selective oligonucleotide hybridization, selective amplification, or selective
primer extension. For example, oligonucleotide primers may be prepared in which the
known mutation is placed centrally and then hybridized to target DNA under conditions
that permit hybridization only if a perfect match is found. See, e.g., Saiki, et al., 1986.
Nature 324: 163; Saiki, et al., 1989. Proc. Natl. Acad. Sci. USA 86: 6230. Such allele
specific oligonucleotides are hybridized to PCR amplified target DNA or a number of
different mutations when the oligonucleotides are attached to the hybridizing membrane
and hybridized with labeled target DNA.

Alternatively, allele specific amplification technology that depends on selective
PCR amplification may be used in conjunction with the instant invention. Oligonucleotides
used as primers for specific amplification may carry the mutation of interest in the center of
the molecule (so that amplification depends on differential hybridization; see, e.g., Gibbs,
et al., 1989. Nucl. Acids Res. 17: 2437-2448) or at the extreme 3'-terminus of one primer
where, under appropriate conditions, mismatch can prevent or reduce polymerase
extension (see, e.g., Prossner, 1993. Tibtech. 11: 238). In addition it may be desirable to
introduce a novel restriction site in the region of the mutation to create cleavage-based
detection. See, e.g., Gasparini, et al., 1992. Mol. Cell Probes 6: 1. It is anticipated that in
certain embodiments amplification may also be performed using Taq ligase for
amplification. See, e.g., Barany, 1991. Proc. Natl. Acad. Sci. USA 88: 189. In such cases,
ligation will occur only if there is a perfect match at the 3'-terminus of the 5' sequence,
making it possible to detect the presence of a known mutation at a specific site by looking
for the presence or absence of amplification.
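The dependence of allele specific amplification on the 3'-terminal base of a primer can be sketched as follows. This is an idealized model introduced for illustration: it assumes the rest of the primer anneals and treats a 3'-terminal mismatch as fully blocking extension, whereas the text notes that a mismatch may only reduce extension. The sequences, positions, and function names are hypothetical, and both primer and template are written in the same sense for simplicity.

```python
# Idealized sketch: a primer whose 3'-terminal base sits on the variant
# position is expected to extend only on a template carrying that base.
def allele_specific_primer(seq: str, variant_pos: int, length: int = 18) -> str:
    """Return a primer (same sense as `seq`) ending at 0-based `variant_pos`."""
    start = variant_pos - length + 1
    if start < 0:
        raise ValueError("variant too close to the start for this primer length")
    return seq[start:variant_pos + 1]


def extension_expected(primer: str, template: str, variant_pos: int) -> bool:
    """True when the primer's 3'-terminal base matches the template base."""
    return primer[-1] == template[variant_pos]


wild = "ATGGCCAAGTTGACC"              # hypothetical wild-type stretch
mutant = wild[:9] + "C" + wild[10:]   # hypothetical T-to-C change at index 9

primer = allele_specific_primer(mutant, variant_pos=9, length=6)
print(primer,
      extension_expected(primer, mutant, 9),
      extension_expected(primer, wild, 9))
```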

The methods described herein may be performed, for example, by utilizing
pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent
described herein, which may be conveniently used, e.g., in clinical settings to diagnose
patients exhibiting symptoms or family history of a disease or illness involving a NOVX
gene.

Furthermore, any cell type or tissue, preferably peripheral blood leukocytes, in
which NOVX is expressed may be utilized in the prognostic assays described herein.
However, any biological sample containing nucleated cells may be used, including, for
example, buccal mucosal cells.

Pharmacogenomics 

Agents, or modulators that have a stimulatory or inhibitory effect on NOVX activity
(e.g., NOVX gene expression), as identified by a screening assay described herein can be
administered to individuals to treat (prophylactically or therapeutically) disorders. The
disorders include but are not limited to, e.g., those diseases, disorders and conditions listed
above, and more particularly include those diseases, disorders, or conditions associated
with homologs of a NOVX protein, such as those summarized in Table A.

In conjunction with such treatment, the pharmacogenomics (i.e., the study of the
relationship between an individual's genotype and that individual's response to a foreign
compound or drug) of the individual may be considered. Differences in metabolism of
therapeutics can lead to severe toxicity or therapeutic failure by altering the relation
between dose and blood concentration of the pharmacologically active drug. Thus, the
pharmacogenomics of the individual permits the selection of effective agents (e.g., drugs)
for prophylactic or therapeutic treatments based on a consideration of the individual's
genotype. Such pharmacogenomics can further be used to determine appropriate dosages
and therapeutic regimens. Accordingly, the activity of NOVX protein, expression of
NOVX nucleic acid, or mutation content of NOVX genes in an individual can be
determined to thereby select appropriate agent(s) for therapeutic or prophylactic treatment
of the individual.

Pharmacogenomics deals with clinically significant hereditary variations in the
response to drugs due to altered drug disposition and abnormal action in affected persons.
See e.g., Eichelbaum, 1996. Clin. Exp. Pharmacol. Physiol., 23: 983-985; Linder, 1997.
Clin. Chem., 43: 254-266. In general, two types of pharmacogenetic conditions can be
differentiated: genetic conditions transmitted as a single factor altering the way drugs act
on the body (altered drug action), and genetic conditions transmitted as single factors altering
the way the body acts on drugs (altered drug metabolism). These pharmacogenetic
conditions can occur either as rare defects or as polymorphisms. For example,
glucose-6-phosphate dehydrogenase (G6PD) deficiency is a common inherited
enzymopathy in which the main clinical complication is hemolysis after ingestion of
oxidant drugs (anti-malarials, sulfonamides, analgesics, nitrofurans) and consumption of
fava beans.

As an illustrative embodiment, the activity of drug metabolizing enzymes is a major
determinant of both the intensity and duration of drug action. The discovery of genetic
polymorphisms of drug metabolizing enzymes (e.g., N-acetyltransferase 2 (NAT 2) and
cytochrome P450 enzymes CYP2D6 and CYP2C19) has
provided an explanation as to why some patients do not obtain the expected drug effects or
show exaggerated drug response and serious toxicity after taking the standard and safe dose
of a drug. These polymorphisms are expressed in two phenotypes in the population, the
extensive metabolizer (EM) and poor metabolizer (PM). The prevalence of PM is different
among different populations. For example, the gene coding for CYP2D6 is highly
polymorphic and several mutations have been identified in PM, which all lead to the
absence of functional CYP2D6. Poor metabolizers of CYP2D6 and CYP2C19 quite
frequently experience exaggerated drug response and side effects when they receive
standard doses. If a metabolite is the active therapeutic moiety, PM show no therapeutic
response, as demonstrated for the analgesic effect of codeine mediated by its
CYP2D6-formed metabolite morphine. At the other extreme are the so called ultra-rapid
metabolizers who do not respond to standard doses. Recently, the molecular basis of
ultra-rapid metabolism has been identified to be due to CYP2D6 gene amplification.
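The relationship described above between CYP2D6 gene content and metabolizer phenotype can be caricatured in a few lines. This toy mapping is an assumption made purely for illustration: the text states only that PM lack functional CYP2D6 and that ultra-rapid metabolism is due to gene amplification; the specific copy-number threshold and the function name are hypothetical, and the sketch is not a clinical rule.

```python
# Toy illustration only: the copy-number threshold for "ultra-rapid" is an
# assumption; the text gives no numeric cut-off.
def metabolizer_phenotype(functional_copies: int) -> str:
    """Map a count of functional CYP2D6 gene copies to a phenotype label."""
    if functional_copies < 0:
        raise ValueError("copy number cannot be negative")
    if functional_copies == 0:
        return "PM"            # absence of functional CYP2D6
    if functional_copies > 2:
        return "ultra-rapid"   # gene amplification (assumed threshold)
    return "EM"


for copies in (0, 2, 3):
    print(copies, metabolizer_phenotype(copies))
```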

Thus, the activity of NOVX protein, expression of NOVX nucleic acid, or mutation
content of NOVX genes in an individual can be determined to thereby select appropriate
agent(s) for therapeutic or prophylactic treatment of the individual. In addition,
pharmacogenetic studies can be used to apply genotyping of polymorphic alleles encoding
drug-metabolizing enzymes to the identification of an individual's drug responsiveness
phenotype. This knowledge, when applied to dosing or drug selection, can avoid adverse
reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency
when treating a subject with a NOVX modulator, such as a modulator identified by one of
the exemplary screening assays described herein.

Monitoring of Effects During Clinical Trials 

Monitoring the influence of agents (e.g., drugs, compounds) on the expression or
activity of NOVX (e.g., the ability to modulate aberrant cell proliferation and/or
differentiation) can be applied not only in basic drug screening, but also in clinical trials.
For example, the effectiveness of an agent determined by a screening assay as described
herein to increase NOVX gene expression, protein levels, or upregulate NOVX activity,
can be monitored in clinical trials of subjects exhibiting decreased NOVX gene expression,
protein levels, or downregulated NOVX activity. Alternatively, the effectiveness of an
agent determined by a screening assay to decrease NOVX gene expression, protein levels,
or downregulate NOVX activity, can be monitored in clinical trials of subjects exhibiting
increased NOVX gene expression, protein levels, or upregulated NOVX activity. In such
clinical trials, the expression or activity of NOVX and, preferably, other genes that have
been implicated in, for example, a cellular proliferation or immune disorder can be used as
a "read out" or markers of the immune responsiveness of a particular cell.

By way of example, and not of limitation, genes, including NOVX, that are
modulated in cells by treatment with an agent (e.g., compound, drug or small molecule)
that modulates NOVX activity (e.g., identified in a screening assay as described herein) can
be identified. Thus, to study the effect of agents on cellular proliferation disorders, for
example, in a clinical trial, cells can be isolated and RNA prepared and analyzed for the
levels of expression of NOVX and other genes implicated in the disorder. The levels of
gene expression (i.e., a gene expression pattern) can be quantified by Northern blot analysis
or RT-PCR, as described herein, or alternatively by measuring the amount of protein
produced, by one of the methods as described herein, or by measuring the levels of activity
of NOVX or other genes. In this manner, the gene expression pattern can serve as a
marker, indicative of the physiological response of the cells to the agent. Accordingly, this
response state may be determined before, and at various points during, treatment of the
individual with the agent.

In one embodiment, the invention provides a method for monitoring the 
effectiveness of treatment of a subject with an agent (e.g., an agonist, antagonist, protein, 
peptide, peptidomimetic, nucleic acid, small molecule, or other drug candidate identified by
the screening assays described herein) comprising the steps of (i) obtaining a
pre-administration sample from a subject prior to administration of the agent; (ii) detecting
the level of expression of a NOVX protein, mRNA, or genomic DNA in the
pre-administration sample; (iii) obtaining one or more post-administration samples from the
subject; (iv) detecting the level of expression or activity of the NOVX protein, mRNA, or
genomic DNA in the post-administration samples; (v) comparing the level of expression or
activity of the NOVX protein, mRNA, or genomic DNA in the pre-administration sample
with that of the NOVX protein, mRNA, or genomic DNA in the post-administration sample or
samples; and (vi) altering the administration of the agent to the subject accordingly. For
example, increased administration of the agent may be desirable to increase the expression
or activity of NOVX to higher levels than detected, i.e., to increase the effectiveness of the
agent. Alternatively, decreased administration of the agent may be desirable to decrease
expression or activity of NOVX to lower levels than detected, i.e., to decrease the
effectiveness of the agent.
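
The comparison in steps (v) and (vi) can be illustrated with a minimal sketch. It assumes, purely for illustration, that the agent is expected to raise NOVX expression, that each sample yields a single positive expression value (for example, a normalized RT-PCR signal), and that a desired fold-change window is known. The function names and the `low` and `high` thresholds are hypothetical and are not taken from the specification.

```python
from statistics import mean


def fold_change(pre_level, post_levels):
    """Ratio of the mean post-administration level to the pre-administration level."""
    if pre_level <= 0:
        raise ValueError("pre-administration level must be positive")
    if not post_levels:
        raise ValueError("at least one post-administration sample is required")
    return mean(post_levels) / pre_level


def suggest_adjustment(pre_level, post_levels, low=1.5, high=3.0):
    """Map the pre/post comparison (step v) to a change in administration (step vi).

    `low` and `high` bound a hypothetical desired fold-change window; they are
    illustrative placeholders, not values given in the specification.
    """
    change = fold_change(pre_level, post_levels)
    if change < low:
        return "increase"   # expression rose less than desired
    if change > high:
        return "decrease"   # expression rose more than desired
    return "maintain"       # expression is within the desired window


if __name__ == "__main__":
    # One pre-administration value and two post-administration values.
    print(suggest_adjustment(10.0, [12.0, 14.0]))
```

For an agent intended to lower NOVX expression, the same comparison would apply with the direction of the adjustment reversed.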

Methods of Treatment 

The invention provides for both prophylactic and therapeutic methods of treating a
subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant
NOVX expression or activity. The disorders include, but are not limited to, those
diseases, disorders and conditions listed above, and more particularly include those
diseases, disorders, or conditions associated with homologs of a NOVX protein, such as
those summarized in Table A.

These methods of treatment are discussed more fully below.

Diseases and Disorders 

Diseases and disorders that are characterized by increased (relative to a subject not
suffering from the disease or disorder) levels or biological activity may be treated with
Therapeutics that antagonize (i.e., reduce or inhibit) activity. Therapeutics that antagonize
activity may be administered in a therapeutic or prophylactic manner. Therapeutics that
may be utilized include, but are not limited to: (i) an aforementioned peptide, or analogs,
derivatives, fragments or homologs thereof; (ii) antibodies to an aforementioned peptide;
(iii) nucleic acids encoding an aforementioned peptide; (iv) antisense
nucleic acids and nucleic acids that are "dysfunctional" (i.e., due to a heterologous insertion
within the coding sequence of an aforementioned peptide) that are
utilized to "knockout" endogenous function of an aforementioned peptide by homologous
recombination (see, e.g., Capecchi, 1989. Science 244: 1288-1292); or (v) modulators (i.e.,
inhibitors, agonists and antagonists, including additional peptide mimetics of the invention
or antibodies specific to a peptide of the invention) that alter the interaction between an
aforementioned peptide and its binding partner.

Diseases and disorders that are characterized by decreased (relative to a subject not
suffering from the disease or disorder) levels or biological activity may be treated with
Therapeutics that increase (i.e., are agonists to) activity. Therapeutics that upregulate
activity may be administered in a therapeutic or prophylactic manner. Therapeutics that
may be utilized include, but are not limited to, an aforementioned peptide, or analogs,
derivatives, fragments or homologs thereof; or an agonist that increases bioavailability.

Increased or decreased levels can be readily detected by quantifying peptide and/or
RNA, by obtaining a patient tissue sample (e.g., from biopsy tissue) and assaying it in vitro
for RNA or peptide levels, structure and/or activity of the expressed peptides (or mRNAs
of an aforementioned peptide). Methods that are well-known within the art include, but are
not limited to, immunoassays (e.g., by Western blot analysis, immunoprecipitation
followed by sodium dodecyl sulfate (SDS) polyacrylamide gel electrophoresis,
immunocytochemistry, etc.) and/or hybridization assays to detect expression of mRNAs
(e.g., Northern assays, dot blots, in situ hybridization, and the like).

Prophylactic Methods

In one aspect, the invention provides a method for preventing, in a subject, a disease
or condition associated with aberrant NOVX expression or activity, by administering to
the subject an agent that modulates NOVX expression or at least one NOVX activity.
Subjects at risk for a disease that is caused or contributed to by aberrant NOVX expression
or activity can be identified by, for example, any or a combination of diagnostic or
prognostic assays as described herein. Administration of a prophylactic agent can occur
prior to the manifestation of symptoms characteristic of the NOVX aberrancy, such that a
disease or disorder is prevented or, alternatively, delayed in its progression. Depending
upon the type of NOVX aberrancy, for example, a NOVX agonist or NOVX antagonist
agent can be used for treating the subject. The appropriate agent can be determined based
on screening assays described herein. The prophylactic methods of the invention are
further discussed in the following subsections.

Therapeutic Methods 

Another aspect of the invention pertains to methods of modulating NOVX
expression or activity for therapeutic purposes. The modulatory method of the invention
involves contacting a cell with an agent that modulates one or more of the activities of
NOVX protein associated with the cell. An agent that modulates NOVX protein
activity can be an agent as described herein, such as a nucleic acid or a protein, a
naturally-occurring cognate ligand of a NOVX protein, a peptide, a NOVX
peptidomimetic, or other small molecule. In one embodiment, the agent stimulates one or
more NOVX protein activities. Examples of such stimulatory agents include active NOVX
protein and a nucleic acid molecule encoding NOVX that has been introduced into the cell.
In another embodiment, the agent inhibits one or more NOVX protein activities. Examples
of such inhibitory agents include antisense NOVX nucleic acid molecules and anti-NOVX
antibodies. These modulatory methods can be performed in vitro (e.g., by culturing the cell
with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject). As
such, the invention provides methods of treating an individual afflicted with a disease or
disorder characterized by aberrant expression or activity of a NOVX protein or nucleic acid
molecule. In one embodiment, the method involves administering an agent (e.g., an agent
identified by a screening assay described herein), or combination of agents that modulates
(e.g., up-regulates or down-regulates) NOVX expression or activity. In another
embodiment, the method involves administering a NOVX protein or nucleic acid molecule
as therapy to compensate for reduced or aberrant NOVX expression or activity.

Stimulation of NOVX activity is desirable in situations in which NOVX is
abnormally downregulated and/or in which increased NOVX activity is likely to have a
beneficial effect. One example of such a situation is where a subject has a disorder
characterized by aberrant cell proliferation and/or differentiation (e.g., cancer or immune
associated disorders). Another example of such a situation is where the subject has a
gestational disease (e.g., preeclampsia).

Determination of the Biological Effect of the Therapeutic 

In various embodiments of the invention, suitable in vitro or in vivo assays are
performed to determine the effect of a specific Therapeutic and whether its administration
is indicated for treatment of the affected tissue.

In various specific embodiments, in vitro assays may be performed with
representative cells of the type(s) involved in the patient's disorder, to determine if a given
Therapeutic exerts the desired effect upon the cell type(s). Compounds for use in therapy
may be tested in suitable animal model systems including, but not limited to, rats, mice,
chickens, cows, monkeys, rabbits, and the like, prior to testing in human subjects. Similarly,
for in vivo testing, any of the animal model systems known in the art may be used prior to
administration to human subjects.

Prophylactic and Therapeutic Uses of the Compositions of the Invention

The NOVX nucleic acids and proteins of the invention are useful in potential 
prophylactic and therapeutic applications implicated in a variety of disorders. The 
disorders include, but are not limited to, those diseases, disorders and conditions listed
above, and more particularly include those diseases, disorders, or conditions associated
with homologs of a NOVX protein, such as those summarized in Table A.

As an example, a cDNA encoding the NOVX protein of the invention may be 
useful in gene therapy, and the protein may be useful when administered to a subject in 
need thereof. By way of non-limiting example, the compositions of the invention will have 
efficacy for treatment of patients suffering from diseases, disorders, conditions and the like,
including but not limited to those listed herein.

Both the novel nucleic acid encoding the NOVX protein, and the NOVX protein of
the invention, or fragments thereof, may also be useful in diagnostic applications, wherein
the presence or amount of the nucleic acid or the protein is to be assessed. A further use
could be as an anti-bacterial molecule (i.e., some peptides have been found to possess
anti-bacterial properties). These materials are further useful in the generation of antibodies,
which immunospecifically bind to the novel substances of the invention for use in
therapeutic or diagnostic methods.

The invention will be further described in the following examples, which do not 
limit the scope of the invention described in the claims. 

EXAMPLES 

Example A: Polynucleotide and Polypeptide Sequences, and Homology Data
Example 1.

The NOV1 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 1A.
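
As a reading aid for Table 1A, the relationship between the reported ORF coordinates and the reported protein lengths can be checked with a short sketch. It assumes the standard genetic code and 1-based nucleotide positions; the function names are hypothetical and are not part of the specification. The lengths and coordinates used are those reported in Table 1A, and the short nucleotide string is the start of the NOV1b DNA sequence as printed there.

```python
# Standard genetic code, built in T, C, A, G order for each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, orf_start=1):
    """Translate from a 1-based ORF start until a stop codon or the end of the sequence.

    The input must contain only A, C, G and T (no spaces or OCR debris).
    """
    seq = dna.upper()[orf_start - 1:]
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[pos:pos + 3]]
        if residue == "*":  # stop codon ends the ORF
            break
        protein.append(residue)
    return "".join(protein)


def codons_before_stop(orf_start, stop_position):
    """Number of codons between the ORF start and the first base of the stop codon."""
    return (stop_position - orf_start) // 3


def codons_to_end(length_bp, orf_start):
    """Number of complete codons when the ORF runs to the end of the sequence."""
    return (length_bp - orf_start + 1) // 3


if __name__ == "__main__":
    # NOV1a: ORF start at 1, stop codon at 6160; Table 1A reports 2053 aa.
    print(codons_before_stop(1, 6160))
    # NOV1b: 1870 bp, ORF start at 2, open to the end; Table 1A reports 623 aa.
    print(codons_to_end(1870, 2))
    # First 37 nucleotides of the NOV1b DNA sequence, read from position 2.
    print(translate("CACCGGTACCACCATGTTGAAGTTCAAATATGGAGCG", orf_start=2))
```

The same arithmetic reproduces the lengths reported for NOV1c (2497 bp, 832 aa), NOV1d (2542 bp, 847 aa) and NOV1f (1915 bp, 638 aa), each with an ORF starting at position 2 and running to the end of the sequence.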



Table 1A. NOV1 Sequence Analysis 




SEQ ID NO: 1 |6189 bp


NOV1a,
CG106764-01
DNA Sequence


ATGTTGAAGTTCAAATATGGAGCGCGGAATCCTTTGGATGCTGGTGCTGCTGAACCCATTGCCAGCCG 
GGCCTCCAGGCTGAATCTGTTCTTCCAGGGGAAACCACCCTTTATGACTCAACAGCAGATGTCTCCTC 
TTTCCCGAGAAGGGATATTAGATGCCCTCTTTGTTC 

AAGATTAAGCACGTGAGCAACTTTGTCCGGAAG0X5TTCCGACACCATAGCTGAGTTACAGGAGCTCCA 
GCCTTCGGCAAAGGACTTCGAAGTCAGAAGTCTTGTAGGTTGTGGTCACTTTGCTGAAGTGCAGGTGG 
TAAGAGAGAAAGCAACCGGGGACATCTATGCTATGAAAGTGATGAAGAAGAAGGCTTTATTGGCCCAG 
GAGCAGGTTTCATTTTTTGAGGAAGAGCGGAACATATTATCTCGAAGCACAAGCCCGTGGATCCCCCA 
ATTACAGTATGCCTTTCAGGACAAAAATCACCTTTATCTGGTGATGGAATATCAGCCTGGAGGGGACT 
TGCTGTCACTTTTGAATAGATATGAGGACCAGTTAGATGAAAACCTGATACAGTTTTACCTAGCTGAG 
CTGATTTTGGCTGTTCACAGCGTTGATC 

TC TCGTTGACCGCACAGGAC ACATCAAGC TGGTGGATTTTGGATCTGCCGCGAAAATGAATTCAAACA 
AGGTGAATGCCAAACTCCCGATTGGGACCCCAGATTACATGGCTCCTGAAGTGCTGACTGTGATGAAC 
GGGGATGGAAAAGGCACCTACGGCCTGGACTGTGACTGGTGGTCAGTGGGCGTGATTGCCTATGAGAT 
GATTTATGGGAGATCCCCCTTCGCAGAGGGAACCTCTGCCAGAACCTTCAATAACATTATGAATTTCC 
AGCGGTTTTTGAAATTTCCAGATGACCCCAAAGTGAGCAGTGACTTTCTTGATCTGATTCAAAGCTTG 
TTGTGCGGCCAGAAAGAGAGACTGAAGTTTGAAGGTCTTTGCTGCCATCCTTTCTTCTCTAAAATTGA 
CTGGAACAACATTCGTAACGCTCCTCCCCCCTTCGTTCCCACCCTCAAGTCTGACGATGACACCTCCA 
ATTTTGATGAACCAGAGAAGAATTCG TGGGTTTC ATCCTCTCCGTGCCAGC TGAGCCCCTC AGGCTTC 
TCGGGTGAAGAACTGCCGTTTGTGGGGTTTTCGTACAGCAAGGCACTGGGGATTCTTGGTAGATCTGA 
GTCTGTTGTGTCGGGTCTGGACTCCCCTGCCAAGACTAGCTCCATGGAAAAGAAACTTCTCATCAAAA 
G CAAAGAGCT AC AAGACTC TC AGG ACAAGTGTC AGAAGATGG AGC^ 

AGAGTGTCAGAGGTGGAGGCTGTGCTTAGTCAGAAGGAGGTGGAGCTGAAGGCCTCTGAGACTCAGAG 
ATCCCTCCTGGAGCAGGACCTTGCTACCTACATCACAGAATGCAGTAGCTTAAAGCGAAGTTTGGAGC 
AAGCACGGATGGAGGTGTCCCAGGAGGATGACAAAGCACTGCAGCTTCTCCATGATATCAGAGAGCAG 
AGCCGGAAGCTCCAAGAAATCAAAGAGCAGGAGTACCAGGCTCAAGTGGAAGAAATGAGGTTGATGAT 
GAATCAGTTGGAAGAGGATCTTGTCTCAGCAAGAAGACGGAGTGATCTCTACGAATCTGAGCTGAGAG 
AGTCTCGGCTTGCTGCTGAAGAATTCAAGCGGAAAGCGACAGAATGTCAGCATAAACTGTTGAAGGCT 
AAGG ATC AGGGG AAGC C TG AAGTGGGAG AATATGCGAAACTG GAG AAG ATCAATGC TGAGC AG CAGCT 
CAAAATTC AGGAGCTCCAAG AG AAAC TGGAGAAGGC TGTAAAAG C CAGCACGG AGGCCAC C GAG C TGC 
TGCAGAATATCCGCCAGGCAAAGGAGCGAGCCGAGAGGGAGCTGGAGAAGCTGCAGAACCGAGAGGAT 
TCTTCTGAAGGCATCAGAAAGAAGCTGGTGGAAGCTGAGGAACGCCGCCATTCTCTGGAGAACAAGGT 
AAAGAGACTAGAGACCATGGAGCGTAGAGAAAACAGACTGAAGGATGACATCCAGACAAAATCCCAAC 
AGATCCAGCAGATGGCTGATAAAATTCTGGAGCTCGAAGAGAAACATCGGGAGGCCCAAGTCTCAGCC 
CAG C ACC TAGAAGTGC ACC TG AAACAG AAAG AGC AGCACTATG AGGAAAAGATTAAAGTATTGG AC AA 
TCAGATAAAGAAAGACCTGGCTGACAAGGAGACACTGGAGAACATGATGCAGAGACACGAGGAGGAGG 
CCCATCAGAAGGGCAAAATTCTCAG^GAACAGAAGGCGATGATCA^ 

TCCCTGGAACAGAGGATTGTGGAACTGTCTGAAGCCAATAAACTTGCAGCAAATAGCAGTCTTTTTAC 
CCAAAGGAACATGAAGGCCCAAGAAGAGATGATTTCTGAACTCAGGCAACAGAAATTTTACCTGGAGA
CACAGGCTGGGAAGTTOGAGGC

GACCACAGTGACAAGAATCGGCTGCTGGAACTGGAGACAAGATTGCC5GGAGGTGAGTCTAGAGCACGA 
GGAGCAGAAACTGGAGCTCAAGCGCCAGCTCACAGAGCTACAGCTCTCCCTGCAGGAGCGCGAGTCAC 
AGTTGACAGCCCTGC^VGGCTGCACGGGCGGCCCTGGAGAGCCAGCTTCGCCAGGCGAAGACAGAGCTG 
GAAGAGACCACAGCAGAAGCTGAAGAGGAGATCCAGGCACTCACGGCACATAGAGATGAAATCCAGCG 
CAAATTTGATGCTCTTCGTAACAGCTGTACTGTGATCACAGACCTGGAGGAGCAGCTAAACCAGCTGA 
CCGAGGACAACGCTGAACTCAACAACCAAAACTTCTACTTGTCCAAACAACTCGATGAGGCTTCTGGC 
GCCAACGACGAGATTGTACAACTGCGAAGTGAAGTGGACCATCTCCGCCGGGAGATCACGGAACGAGA 
G ATGC AGC TT AC C AGCCAG AAG CAAACG ATGG AGGC TCTG AAGAC CACGTGC ACC ATG C TGGAGG AAC 
AGGTCATGGATTTGGAGGCCCTAAACGATGAGCTGCTAGAAAAAGAGCGGCAGTGGGAGGCCTGGAGG 
AGCGTCCTGGGTGATG AGAAATCC C AGTTTG AGTGTCGGGTTCG AGAGC TGCAGAGGATGC TGGACAC 
CGAGAAAC^GAGCAGGGCGAGAGCCGATCAGCGGATCACCGAGTCTCGCCAGGTGGTGGAGCTGGCAG 
TGAAGGAGCACAAGGCTGAGATTCTCGCTCTGCAGCAGGCTCTCAAAGAGCAGAAGCTGAAGGCCGAG 
AGCCTCTCTG ACAAGC TCAATG ACCTGGAGAAGAAGCATGC TATGCTTG AAATGAATGCCCGAAGC TT 
ACAGCAGAAGCTGGAGACTGAACGAGAGCTCAAACAGAGGCTTCTGGAAGAGCAAGCCAAATTACAGC 
AGCAGATGGACCTGCAGAAAAATCACATTTTCCGTCTGACTCAAGGACTGCAAGAAGCTCTAGATCGG 
GCTGATCTACTGAAGACAGAAAGAAGTGACTTGGAGTATCAGCTGGAAAACATTCAGGTGCTCTATTC 
TCATGAAAAGGTGAAAATGGAAGGCACTATTTCTCAACAAACCAAACTCATTGATTTTCTGCAAGCCA 
AAATGGACC AACCTGC TAAAAAG AAAAAGGTG CC TCTGCAGTAC AATGAGCTGAAGCTGG CCCTGGAG 
AAGGAGAAAGCTCGCTGTGCAGAGCTAGAGGAAGCCCTTCAGAAGACCCGCATCGAGCTCCGGTCCGC 
CCGGGAGGAAGCTGCCCACCGCAAAGCAACGGACCACCCACACCCATCCACGCCAGCCACCGCGAGGC 
AGCAGATCGCCATGTCTGCCATCGTGCGGTCGCCAGAGCACCAGCCCAGTGCCATGAGCCTGCTGGCC 
CCGCCATCCAGCCGCAGAAAGGAGTCTTCAACTCCAGAGGAATTTAGTCGGCGTCTTAAGGAACGCAT 
GCACCACAATATTCCTCACCGATTCAACGTAGGACTGAACATGCGAGCCACAAAGTGTGCTGTGTGTC 
TGGATACCGTGCACTTTGGACGCCAGGCATCCAAATGTCTAGAATGTCAGGTGATGTGTCACCCCAAG 
TGCTCCACGTGCTTGCCAGCCACCTGCGGCTTGC CTGCTGAATATGC C ACACACTTCACCGAGGCCTT 
CTGCCGTGACAAAATGAACTCCCCAGGTCTCCAGACCAAGGAGCCCAGCAGCAGCTTGCACCTGGAAG 
GG TGGATG AAGG TGCCC AGG AATAAC AAACG AGG AC AGCAAGGC TGGG AC AGGAAG TAC AT TG T C C TG 
GAGGGATCAAAAGTCCTCATTTATGACAATGAAGCCAGAGAAGCTGGACAGAGGCCGGTGGAAGAATT 
TGAGCTGTGCCTTCCCGACGGGGATGTATCTATTGATGGTGCCGTTGGTGCTTCCGAACTCGCAAATA 
CAGCCAAAGCAGATGTCCCATACATACTGAAGATGGAATCTCACCCGCACACCACCTGCTGGCCCGGG 
AG AAC CC TC TAC TTGC TAGCTC CC AG C TTCCC TG ACAAAC AG C G C TGGG TCACCGC C TTAGAATCAGT 
TGTCGCAGGTGGGAGAGTTTCTAGGGAAAAAGCAGAAGCTGATGCTAAACTGCTTGGAAACTCCCTGC 
TGAAACTGGAAGGTGATGACCGTC TAGACATGAAC TGCACGCTGCCC TTCAGTGACC AGGTAGTGTTG 
GTGGGC ACCGAGGAAGGGCTCTACG CCCTG AATGTCTTGAAAAAC TC C C TAAC CC ATGTCCCAGGAAT 
TGGAGK^GTCTTCCAAATTTATATTATCAAGGACCTGGAGAAGCTACTCATGATAGCAGGTGAAGAGC 
GGGCACTGTGTCTTGTGGACGTGAAGAAAGTGAAACAGTCCCTGGCCCAGTCCCACCTGCCTGCCCAG 
CCCGACATCTCACCCAACATTTTTGAAGCTGTCAAGGGCTGCCACTTGTTTGGGGCAGGCAAGATT 
GAACGGGCTCTGCATCTGTGCAGCCATGCCCAGCAAAGTCGTCATTCTCCGCTACAACGAAAACCTCA 
GCAAATACTGCATCCGGAAAGAGATAGAGACCTCAGAGCCCTGCAGCTGTATCCACTTCACCAATTAC 
AGTATCCTCATTGX3AACCAATAAATTCTACGAAATCGACATGAAGCAGTACACGCTCGAGGAATTCCT 
GGAT AAG AATG ACCATTCC TTGGCAC C TGC TGTG TT TGC CGC C TC T TC CAACAGC T TC C C TG TC TC AA 
TCGTGCAGGTGAACAGCGCAGGGCAGCGAGAGGAGTACTTGCTGTGTTTCCACGAATTTGGAGTGTTC 
GTGGATTCTTACGGAAGACGTAGCCGCACAGACGATCTCAAGTGGAGTCGCTTACCTTTGGCCTTTGC 
CTACAGAGAACCCTATCTGTTTGTGACCCACTTCAACTCACTCGAAGTAATTGAGATCCAGGCACGCT 
CCTCAGCAGGGACCCCTGCCCGAGCGTACCTGGACATCCCGAACCCGCGCTACCTGGGCCCTGCCATT 
TCCTCAGGAGCGATTTACTTGGCGTCCTCATACCAGGATAAATTAAGGGTCATTTGCTGCAAGGGAAA 
CCTCGTGAAGGAGTCCGGCACTGAACACCACCGGGGCCCGTCCACCTCCCGCAGCAGCCCCAACAAGC 
GAGGCC GACCCACGT ACAACGAGCACATCACCAAGCGCGTGGCCTCCAGCCCAGCGCCG CCCGAAGGC 
CCCAGC C ACCCGCGAGAGCCAAGC AC AC CC CACCGCTACCGCGAGGGGCGG ACCG AGCTGCGC AGGGA 
CAAGTCTCCTGGCCGCCCCCTGGAGCGAGAGAAGTCCCCCGGCCGGATGCTCAGCACGCGGAGAGAGC 
GGTCCCCCGGGAGGCTGTTTGAAGACAGCAGCAGGGGCCGGCTGCCTGCGGGAGCCGTGAGGACCCCG 
CTGTCCCAGGTGAACAAGGTGTGGGACCAGTCTTCAGTATA AATCTCAGCCAGAAAAACCAACTCCTC 
A 



ORF Start: ATG at 1 |ORF Stop: TAA at 6160





SEQ ID NO: 2 |2053 aa |MW at 234700.1kD


NOV1a,
CG106764-01
Protein
Sequence


^ILKFKYGARNPLDAGAAEPIASRASRLNLFFQGKPPFMTQQQMSPLSREGIXiDALFVLFEECSQPALM 
K IKHVSNFVRKC SDTIAELQELQPSAKDFEVRSL VGCGHFAEVQVVREKATGD I YAMKVMKKKALIjAQ 
EQVSFFEEERN IL SR STS PWI PQLQYAFQDKNHL YLVMEYQPGGDLLSLLNR YEDQLDENL IQFYLAE 
LILAVHSVHLMGYVHRDIKPENILVDRTGHIKLVDFGSAAKMNSNKVNAKLPIGTPDYMAPEVL 
GDGKGT YGLDCDWWS VGV I AYEMI YGR S PFAEGTS ARTFNNIMNFQRFLKFPDDPKVS SDFLDL I QSL 
LCGQKERLKFEGLCCHPFFSKIDWNNIRNAPPPFVPTLKSDDDTSNFDEPEKNSWVSSSPCQLSPSGF 
SGEEL PFVGFS YSKALG X LGRSES WSGLDSPAKTS SMEKKLLIKSKELQDSQDKCHKMEQEMTRLHR 
RVSEVEAVLSQIO^LKASETQRSIiEQDIATYITECSSLKRSLEQARMEVSQEDDKALQIJ, 
SRKLQEIKEQEYQAQVEEMRLMMNQLEEDLVSARRRSDLYESELRESR^ 

KDQGKPEVGEYAKLEKINAEQQLKIQELQEKLEKAVKASTEATELLQNIRQAKERAERELEKLQNRED 
S SEGIRKKLVEAEERRHS LENKVKRLETMERRENRLKDDI OTKSOO IOOMADKI LELE EKHREAOVS A 






WO 03/029424 



PCT/US02/31373 



QHLEVHLKQKEQHYEEKIKVLDNQIKKDL^ 

SLEQRIVELSEANKLAANSSLFTQRNMKAQEEMISELRQQKFYLETQAGKLEAQNRKLEEQLEKISHQ 
DHSDKI^LELETRLREVSLEHEEQKLELKRQLTELQLSLQERESQLTALQATiRAALESQLRQAKTEL 
EETTAEAEEEIQALTAHRDEIQRKFDALRNSCTVITDLEEQLNQLTEDNAELNNQNFYLSKQLDEASG 
ANDEIVQLRSEVDHLRREITEREMQLTSQKQTMEALKTTC 

S VLGDEKSQFECRVRELQRMLDTEKQSRARADQRITE SRQWEL AVKEHKAE I LALQQALKEQKLKAE 

S L SDKLNDLEKKHAMLEMNAR SLQQKLETERELKQRLLEEQAKLQQQMDLQKNH I FRLTQGLQEALDR 

ADLLKTERSDLEYQLEN I QVL YSHEKVKMEGTI SQQTKL I DFLQAKMDQ PAKKKKVPLQ YNELKLALE 

KEKARCAEIjEEALQKTRI ELRS AREEAAHRKATDHPHPSTPATARQQ I AMS AI VR S PEHQP SAMSLLA 

PPSSRRKESSTPEEFSRRiKERMHHNIPHRFNVGLNMRATKCAVCLDTVHFGRQASKCLEC^ 

C S TCL PATCGLPAE YATHFTEAFCRDKMNS PGLQTKEPS SSLHLEGWMKVPRNNKRGQQGTORKYI VL 

EGSKVLIYDNEAREAGQRPVEEFELCLPDGOTSIHGAVGASELAOTAKADVPYILKK^ 

RTLYLLAPSFPDKQRVAH'ALESWAGGRVSREKAEADAKLLGNSLLKLEGDDRLDMNCTLPFSDQVVL 

VGTEEGLYALNVLKNSLTHVPGIGAVFQIYIIKDLEKLLMIAGEERALCLVDVKKVKQSLAQSHLPAQ 

PDISPNIFEAVKGCHLFGAGKIENGLCICAAMPSKWILRYNENLSKYCIRKEIETSEPCSCIHFTNY 

S I L I GTNKFYE I DMKQYTLEEFLDKNDH SLAPAVFAASSNS F P VS I VQVNSAGQREEYLLC FHEFGVF 

VD S YGRRSRTDDLKWSRLPLAF AYREPYLFVTHFNSLEVIE IQAR S S AGTPARAYLDI PNPRYLGPA I 

SSGAIYIASSYQDKLRVICCKGNLVKESGTEHHRGPSTSRSSPNKRGPPTYNEHITKRVASSPAPPEG 

PSHPREPSTPHRYREGRTELRRDKSPGRPLEREKS PGRMLSTRRERS PGRLFEDS SRGRLPAGAVRTP 

LSQVNKVWDQSSV 



SEQ ID NO: 3 |1870 bp


NOV1b,
268667493
DNA Sequence



CACCGGTACCACCATGTTGAAGTTCAAATATGGAGCGCGGAATCCTTTGGATGCTGGTGCTGCTGAA 



AGCAGATGTCTCCTCTTTCCCGAGAAGGGATATTAGATGCCCTCTTTGTTCTCTTTGAAGAATGCAG 

TC AGCC TGCTCTGATGAAG ATTAAGC ACGTGAGCAACTTTGTCCGGAAGT ATTC CGACACC ATAGCT 

GAGTTACAGGAGCTCCAGCCTTCGGC^AAGKSACTTCGAAGTCAGAAGTCTTGTAGGTTGTGGTCACT 

TTGCTG AAGTGCAGG TGG T AAGAGAG AAAG C AACCGGGGAC ATC T ATGCT ATGAAAG TGATGAAG AA 

GAAGGCTTTATTGGCCCAGGAGCAGGTTTCATTTTTTGAGGAAGAGCGGAACATATTATCTCGAAGC 

ACAAGCCCGTGGATCCCCCTU^TTACAGTATGCCTTTCAGGACAAAAATCACCTTTATCTGGTCATGG 

AATATCAGCCTGGAGGGGACTTGCTGTCACTTTTGAATAGATATGAGGACCAGTTAGATGAAAACCT 

G ATACAGTTTT AC CT AGC TGAGC TG ATTTTGGC TGTTCACAGC GTTC ATC TGATGGGAT ACGTGCAT 

CGAGAC ATC AAGCC TG AGAACATTC TCGT TGAC CGCACAGG AC AC ATC AAGC TGGTGGATT TTGGAT 

CTGCCGCGAAAATGAATTCAAACAAGATGGTGAATGCCAAACTCCCGATTGGGACCCCAGATTACAT 

GGCTCCTGAAGTGCTGACTGTGATGAACGGGGATGGAAAAGGCACCTACGGCCTGGACTGTGACTGG 

TGGTCAGTGGGCGTGATTGCCTATGAGATGATTT ATGGGAGATCCCCC TTCGCAGAGGGAACC TCTG 

CCAGAACCTTCAATAACATTATGAATTTCCAGCGGTTTTTGAAATTTCCAGATC 

CAGTGACTTTCTTGATCTGATTCAAAGCTTGTTGTGCGGCCAGAAAGAGAGACTG 

CTTTGCTGCCATCCTTTCTTCTCTAAAATTCACTGGAACAACATTCGTAACTCTCCTCCCCCCTTCG 

TTCCCACCCTCAAGTCTCACGAT^^CACCTCCAATTTTCATGAACCAGAGAAGAAT/PCGTGGGTTTC 

ATCCTCTCCGTGCCAGCTCAGCCCCTCAGGCTTCTCGGGTGAAGAACTGCCGTTTGTGGGGTTTTCG 

TACAGC AAGGC ACTGGGGATTCTTGGTAGATC TGAGTCTGTTGTGTCGGGTCTGGAC TCCCCTGCCA 

AGAC TAGCTCC ATGGAAAAGAAAC TTCTC ATC AAAAGCAAAGAGC T AC AAG ACTCTC AGGACAAGTG 

TCACAAGATGGAGCAGGAAATGACCCGGTTACATCGGAGAGTGTCAGAGGTGGAGGCTGTGCTTAGT 

CAGAAGGAGGTGGAGCTGAAGGCCTCTGAGACTC AGAGATCCC TC C TGGAGCAGGACCTTGCTACCT 

AC ATC AC AG AATGCAG T AGC TT AAAGCGAAGT T TGGAGC AAGCAC GGATGG AGGTGTCC CAGGAGGA 

TCACAAAGCACTGCAGCTTCTCCATGATATCAGAGAGCAGAGCCGGAAGCTCCAAGAAATCAAAGAG 

CAGGAGTACCAGGCTCAAGTGGAAGAAATGAGGTTGATGATGAATCAGTTGGAAGAGGATCTTGTCT 

CAG C AAG AAGACGGAGTGATC TCTACGAATC TGAGC TGAGAGAGTC T C GGC TTGCTGC TG AAGAATT 

CAAGCGGAAAGCGACAGAATGTCAGCATAAACTGTTGAAGGCTAAGGATCAGGTCGACGGC 



ORF Start: at 2 |ORF Stop: end of sequence





SEQ ID NO: 4 |623 aa |MW at 70970.0kD 


NOV1b,
268667493 
Protein 
Sequence 


TGTTMLKFKYGAROTLDAGAAEPIJVSRASRLNLFFQGKPPFOTQQQMSPLSREGILDALFVLFEEC^ 
QPALMKIKHVSNFVRKYSDT IAELQELQPSAKDFEVRS LVGCGHFAEVQWREKATGDI YAMKVMKK 
KALLAQEQVSFFEEERNIL SRSTS PWI PQLQYAFQDKNHLYLVMEYQ PGGDLLSLLNRYEDQLDENL 
IQFYLAELILAVHSVHLMGYVHRDIKPENILVDRTGHI 
APEVLTVMNGTOKGTYGIjDCDWWSVGVIAYEMIYGROT 
SDFLDLIQSIXCGQKERLKFEGLCCHPFFSKIDWNNIRNSPPPFVP 

SSPCQLSPSGFSGEELPFVGFSYSKALGILGRSESWSGLDSPAKTSSMEKKLLIKSKELQDSQDKC 
HKMEOEWPTRLHRRVS EVEAVLSOKEVELKASETORSLLEODLATYI TEC S SLKRSLEOARMEVSOED
DKALQLLHDIREQ^^

KRKATECQHKLLKAKDQVDG 



SEQ ID NO: 5 |2497 bp


NOV1c,
268667539
DNA Sequence


CACCGGTACCCAGGGGAAGCCTGAAGTGGGAGAATATGCGAAACTGGAGAAGATCAATGCTGAGCAGC 
AGCTCAAAATTCAGGAGCTCCAAGAGAAACTGGAGAAGGCTGTAAAAGCCAGCACGGAGGCCACCGAG 
CTGCTGCAGAATATCCGCCAGGCAAAGGAGCGAGCCGAGAGGGAGCTGGAGAAGCTGCAGAACCGAGA 
GG ATTC TTCTG AAGGCATCAGAAAGAAGCTGGTGGAAGCTGAGGAACGCCGCC ATTC TCTGGAGAACA 
AGGTAAAGAGACTAGAGACCATGGAGCGTAGAGAAAACAGACTGAAGGATGACATCCAGACAAAATCC 
C AACAGATCC AGCAGATGGCTGATAAAATTC TGGAGCTCGAAGAGAAAC ATCGGGAGGCC C AAGTCTC 
AGCCCAGCACCTAGAAGTGCACCTGAAACAGAAAGAGCAGCACTATGAGGAAAAGATTAAAGTGTTGG 
ACAATCAGATAAAGAAAGACCTGGCTGACAAGGAGACACTGGAGAACATGATGCAGAGACACGAGGAG 
GAGGCCCATGAGAAGGGCAAAATTCTCAGCGAACAGAAGGCGATGATCAATGCTATGGATTCCAAGAT 
CAGATCCCTGGAACAGAGGATTGTGGAACTGTCTGAAGCCAATAAACTTGCAGCAAATAGCAGTCTTT 
TTACCCAAAGGAACATGAAGGCCCAAGAAGAGATGATTTCTGAACTCAGGCAACAGAAATTTTACCTG 
G AGACAC AGGCTGGG AAGTTG G AGGC CC AGAAC CGAAAAC TGGAGGAGCAGC TGG AGAAG ATC AGC C A 
CCAAGACCACAGTGACAAGAATCGGCTGCTGGAACTGGAGACAAGATTGCGGGAGGTCAGTCTAGAGC 
ACGAGGAGCAGAAACTGGAGCTC AAGCGCC AGC TCACAGAGCTACAGCTCTC C CTGC AGGAGCGCGAG 
TCACAGTTGACAGCCCTGCAGGCTGCACGGGCGGCCCTGGAGAGCCAGCTTCGCCAGGCGAAGACAGA 
GCTGGAAGAGACCACAGCAGAAGCTGAAGAGGAGATCCAGGCACTCACGGCACATAGAGATGAAATCC 
AGCGCAAATTTGATGCTCTTCGTAACAGCTGTACTGTAATCACAGACCTGGAGGAGCAGCTAAACCAG 
C TGACCGAGGACAACGCTG AAC TC AACAACCAAAACTTCTACTTGTCC AAAC AAC TCGATG AGGCTTC 
TG GCGC CAAC G ACG AGATTGTACAAC TGCGAAGTGAAGTGGAC CATC TC C G C C GGG AGATC AC GGAAC 
GAGAGATG C AGC TT ACCAGCCAG AAGC AAACGATGGAGGC TCTGAAGACC ACG TGCAC CATGC TGG AG 
GAACAGGTCATGGATOTGGAGGCCCTAAACGATGAGCTGCTAGAAAAAGAGCGGCAGTGGGAGGCCTG 
GAGGAGCGTCCTGGGTGATGAGAAATCCCAGTTTGAGTGTCGGGTTCGAGAGCTGCAGAGAATGCTGG 
ACAC CGAGAAACAG AGCAGGG CGAGAGC CGATCAGCGGATC ACCGAGTC TCGC CAGG TGGTGGAGCTG 
GCAGTGAAGGAGCAC AAGGCTGAGATTCTCGCTC TGCAGCAGGCTCTC AAAGAG C AGAAGCTGAAGGC 
CGAGAGCCTCTCTGACAAGCTCAATGACCTGGAGAAGAAGCATGCTATGCTTGAAATGAATGCCCGAA 
GCTTACAGCAGAAGCTGGAGACTGAACGAGAGCTCAAACAGAGGCTTCTGGAAGAGCAAGCCAAATTA 
CAGC AGCAGATG G ACC TGCAGAAAAATC ACAT TTTC CGTC TGAC TCAAGG AC TGC AAG AAGC TC TAGA 
TCGGGCTGATCTACTGAAGAGAGAAAGAAGTGACTTGGAGTATCAGCTGGAA 

ATTC TC ATGAAAAGGTG AAAATGG AAGGCAC TATTTCTC AAC AAAC C AAAC TCATTGATTTTC TGC AA 

GCCAAAATGGACCAACCTGCTAAAAAGAAAAAGGTTCCTCTGCAGTACAATGAGCTGAAGCTGGCCCT 

GGAGAAGGAGAAAGCTCGCTGTGCAGAGCTAGAGGAAGCCCTTCAGAAGACCCGCATCGAGCTCCGGT 

CCGCCCGGGAGGAAGCTGCCCACCGCAAAGCAACGGACCACCCACACCCATCCACGCCAGCCACCGCG 

AGGCAGCAGATCGCCATGTCCGCCATCGTGCGGTCGCCAGAGCACCAGCCGAGTC 

GGCCCCGCCATCCAGCCGCAGAAAGGAGTCTTCAACTCCAGAGGAATTTAGTCGGCGTCTTAAGGAAC 

GCATGCACCACAATATTCCTCACCG^TTCAACGTAGGACTGAACATGCGAGCCACAAAGTGTGCTGTG 

TGTCTGGATACCGTGCACTTTGGACGCCAGGCATCCAAATGTCTCGAATGTCAGGTGATGTGTC^ 

CAAGTGCTCCACGTGCTTGCCAGCCACCTGCGGCTTGCCTGTCGACGGC 




ORF Start: at 2 |ORF Stop: end of sequence 





SEQ ID NO: 6 |832 aa |MW at 96885.8kD 


NOV1c,
268667539 
Protein 
Sequence 


TGTQGKPEVGEYAKLEKINAEQQLKIQELQEKLEKAVKASTEATELLQNIRQAKERAERELEKLQNRE 
DSSEGIRKKLVEAEERRHSLEbHCVKRLETMERRENRLKD^ 

AQHLEVHLKQKEQHYEEKI KVLDNQIKKDL ADKETLENMMQRHEEEAHEKGK ILS EQKAMINAMDSKI 
RSLEQR I VELS EANKLAANSSLFTQRNMKAQEEMI S ELRQQKFYLETQ AGKL EAQNRKLEEQLEK I SH 
QDHSDKNRLLELETRIiREVSLEHEEQKLELKRQLTELQLSLQERESQLTALQAARAALESQLRQAKTE 
LEETTAEAEEEIQALTAHRDEIQRKFDALRNSCTVITDLEEQ 
GANDEIVQLRSEVDHLRREITEREMQLTSQKQTMEALK^ 

RSVLGDEKSQPECRVRELQRMLDTEKQSRARADQR I TESRQWELAVKEHKAE I LALQQALKEQKLKA 
ESLSDKLNDLEKKHAMLE^ARSLQQKLETERELKQRLLEEQAKLQQQMDLQKNHIFRLTQGLQEALD 
RADLLKTERSDLEYQLENIQVLYSHEKVKMEGTISQQTKLIDPLQAKMDQPAKKKKVPLQYNEI^ 
EKEKARCAELEEALQKTRI ELRS AREEAAHRKATDHPHPSTPATARQQIAMS AI VRS PEHQP S AMSLL 
APPS SRRKESSTPEEFSRRLKERMHHNI PHRFNVGLNMRATKCAVCLDTVHFGRQASKCLECQVMCHP 
KCSTCLPATCGLPVDG



SEQ ID NO: 7 |2542 bp


NOV1d,
268667543
DNA Sequence


CACCGGTACCCAGGGGAAGCCTGAAGTGGGAGAATATGCGAAACTGGAGAAGATCAATGCTGAGCAG 

CAGCTCAAAATTCAGGAGCTCCAAGAGAAACTGGAGAAGGCTGTAAAAGCCAGCACGGAGGCCACCG 

AGCTGCTGCAGAATATCCGCCAGGCAAAGGAGCGAGCCGAGAGGGAGCTGGAGAAGCTGCAGAACCG 

AGAGGAT TC TT C TG AAGGC AT C AG AAAG AAG C TG GTGG AAGCTGAGG AACGCCGCCAT TCTCTGGAG 

AACAAGGTAAAGAG ACTAGAGAC C ATGGAGCGTAGAGAAAACAGAC TGAAGGATGAC ATCC AGACAA 

AATCCCAACAGATCCAGCAGATGGCTGATAAAATTCTGGAGCTCGAAGAGAAACATCGGGAGGCCCA 

AGTCTCAGCCCAGCACCTAGAAGTGCACCTGAAACAGAAAGAGC^GCACTATGAGGAAAAGATTAAA 

G TGTTGGAC AATC AG AT AAAGAAAG ACC TGGC TG AC AAGGAGACAC TGG AGAAC ATG ATGC AGAGAC 

ACGAGGAGGAGGCCCATGAGAAGGGCAAAATTCTCAGCGAACAGAAGGCGATGATCAATGCTATGGA 

TTCCAAGATCAGATCCCTGGAACAGAGGATTGTGGAACTGTCTGAAGCCAATAAACTTGCAGCAAAT 

AGCAGTCTTTTTACCCAAAGGAACATGAAGGCCCAAGAAGAGATGATTTCTGAACTCAGGCAACAGA 

AATTTTAC CTGGAG ACACAGG C TGGG AAGTTGGAGGCCC AGAACCGAAAACTGG AGGAGC AGCTGG A 

GAAGATCAGCCACCAAGACCACAGTGACAAGAATCGGCTGCTGGAACTGGAGACAAGATTGCGGGAG 

GTC AGTC TAGAGCACG AGGAGC AG AAACTGGAGCTCAAGCGCCAGC TC ACAGAGCTAC AGCTCTCCC 

TGCAGGAGCGCGAGTCACAGTTGACAGCCCTGCAGGCTGCACGGGCGGCCCTGGAGAGCCAGCTTCG 

CCAGGCGAAGACAGAGCTGGAAGAGACCACAGCAGAAGCTGAAGAGGAGATCCAGGCACTCACGGCA 

CATAGAGATGAAATCCAGCGCAAATTTGATGCTCTTCGTAACAGCTGTACTGTAATCACAGACCTGG 

AGGAGCAGCTAAACCAGCTGACCGAGGACAACGCTGAACTCAACAACCAAAACTTCTACTTGTCCAA 

ACAACTCGATGAGGCTTCTGGCGCCAACGACGAGATTGTACAACTGCGAAGTGAAGTGGACCATCTC 

CGCCGGGAGATCACGGAACGAGAGATGCAGCTTACCAGCCAGAAGCAAACGATGGAGfGCTCTGAAGA 

CCACGTGCACCATGCTGGAGGAACAGGTCATGGATTTGGAGGCCCTAAACGATGAGCTGCTAGAAAA 

AGAGCGGCAGTGGGAGGCCTGGAGGAGCGTCCTGGGTGAT<^GAAATCCCAGTTTGAGTGTCGGGTT 

CGAGAGCTGCAGAGGATGCTGGACACCGAGAAACAGAGCAGGGCGAGAGCCGATCAGCGGATCACCG 

AGTCTCGCCAGGTGGTGGAGCTGGC^GTGAAGGAGCACAAGGCTGAfiATTCTCGCTCTGCAGC 

TC TCAAAG AGC AGAAG C TG AAGGC CG AG AGC C TC TCTG AC AAGCTC AATGACC T GG AGAAG AAGCAT 

GCTATGCTTGAAATGAATGCCCGAAGCTTACAGCAGAAGCTGGAGACT<3AACGAGAGCTCAAACAGA 

GGCTTCTGGAAGAGCAAGCCAAATTAC^GCAGCAGATGGACCTGCAGAAAAATCACATTTTCCGTCT 

G ACTC AAGG ACTGCAAGAAGCTCT AG ATCGGGC TGATC TAC TG AAGACAGAAAG AAGTG ACTTGG AG 

TATCAGCTGGAAAACATTCAGGTTCTCTATTCTCATGAAAAGGTGAAAATGGAAGGCACTATTTCTC 

aacaaaccaaactcattgattttctgcaagccaaaatggaccaa 

atttagtcgacggaaagaggaccctgctttacccacacaggttcctctgcagtacaatgagctgaag 
ctggccctggagaaggagaaagctcgctgtgcagagctagaggaagcccttcagaagacccgcatcg 
agctccggtccgcccgggaggaagctgcccaccgcaaagcaacggaccacccacacccatccacgcc 
agcc^ccgcgaggcagcagatcgccatgtctgccatcgtgcggtcgccagagcaccagcccagtgcc 
atgagcctg<:tggccccgccatccagccgcagaaaggagtcttcaactccagaggaatttagtcggc 
gtcttaaggaacgcatgcaccacaatattcctcaccgattcaacgtaggactgaacatgcgagccac 
aaagtgtgctgtgtgtctggataccgtgcactttggacgccaggcatccaaatgtctcgaatgtcag 
gtgatgtgtcaccccaagtgctccacgtgcttgccagccacctgcggcttgcctgtcgacggc 




ORF Start: at 2 |ORF Stop: end of sequence





SEQ ID NO: 8 |847 aa |MW at 98582.7kD


NOV1d,
268667543
Protein
Sequence


tgtqgkpevgeyaklekinaeqqlkiqelqeklekavkasteatellqnirqakeraereleklqnk 
edssegirkklveaeerrhslenkvkrletmerre^ 

vs aqhlevhlkqkeqhyeek i kvldnqikkdi*adketlenmmqrheeeahekgk i ls eqkaminamd 
skirsleqrivelseanklaansslftqrl^aqeemiselrqqkfyletqagkleaqnrkleeqle 
kishqdhsdknrlleletrlrevsleheeqklelkrqltelqlslqeresqltalqaaraalesqlr 

QAXTELEETTAEAEEEIQALTAHRDEIQRKFDALRNSCW 

QLDEASGANDE WQLRSEVDHLRREI TEREMQLTSQKQTMEALKTTCTMLEEQVMDLEALNDELLEK 

erqweawrsvlgdeksqfecrvrelqrmlmekqsraradq 
lkeqklkaeslsdklndlekkhal^emnarslqq^ 

tqglqealdradllktfjisdle yqleniqvl yshekvkmegti sqqtkl idflqakmdqp akkkkgl 
fsrrked pal ptqvplqynelkiialekelcarcaeleealqktri elrs areeaahrkatdhphpstp 
atarqqiamsaivrspehqpsamsllappssrrkesstpeefsrrlkermhhniphrfntvgln^ 
kcavcldtvhfgrqaskclecqvmchpkcstclpatcglpvdg 



SEQ ID NO: 9 |1870 bp


NOV1e,
268667555
DNA Sequence



CACCGGTACCTGCGGCTTGCCTC ; 
TGAACTCCCCAGGTCTCCAGACCAAGGAGCCCAGCAGCAGCTTGCACCTGGAAGGGTGGATGAAGGTG 
CCCAGGAATAACAAACGAGGACAGCAAGGCTGGGACAGGAAGTACATTGTCCTGGAGGGATCAAAAGT 
CCTCATTTATGACAATGAAGCCAGAGAAGCTGGACAGAGGCCGGTGGAAGAATTTGAGCTGTGCCTTC 
CCGACGGGGATGTATCTATTCATGGTGCCGTTGGTGCTTCCGAA 

GTCCCATACATAC1X3AAGATGGAATCTCACCCGCACACCACCTGCTGGCCCGGGAGAACCCTCTACTT 
GCTAGCTCCCAGCTTCCCTGACAAACAGCGCTGGGTCACCGCCTTAGAATCAGTTGTCGCAGGTGGGA 
GAGTTTC TAGGGAAAAAGC AGAAGCTGATGCTAAACTGC TTGGAAACTCC CTGCTGAAACTGGAAGGT 



AGGGCTCTACGCCCTGAATGTCTTGAAAAACTCCCTAACCCATGTCCCAGGAATTGGAGCAGTCTTCC 
AAATTTATATTATCAAGGACCTGGAGAAGCTACTCATGATAGCAGGAGAAGAGCGGGCACTGTGTCTT 
GTGGACGTGAAGAAAGTGAAACAGTCCCTGGCCCAGTCCCACCTGCCTGCCCAGCCCGACATCTCACC 
CAACATTTTTGAAGCTGTCAAGGGCTGCCACTTGTTTGGGGCA 

TCTGTGCAGCCATGCCCAGCAAAGTCGTC ATTC TCCGCTAC AACGAAAAC CTC AGCAAATACTGCATC 
CGGAAAGAGATAGAGACCTCAGAGCCCTGCAGCTGTATCCACTTCACCAATTACAGTATCCTCATTGG 
AACC AAT AAATTCTACGAAATCGAC ATGAAGC AGTAC ACGC TCGAGG AATTCC TGGATAAGAATGACC 
ATTCCTTGGCACCTGCTGTGTTTGCCGCCTCTTCCAACAGCTTCCCTGTCTCAATCGTGCAGGTGAAC 
AGCGC AGGGCAGCGAGAGGAGTAC TTGCTGTGTTTCCACGAATTTGGAGTGTTCGTGGATTC TTACGG 
AAGACGTAGCCGCACAGACGATCTCAAGTGGAGTCGCTTACCTTTGGCCTTTGCCTACAGAGAACCCT 
ATCTGTTTGTGACCCACTTCAACTCACTCGAAGTAATTGAGATCCAGGCACGCTCCTC 
CCTGCCCGAGCGTACCTGGACATCCCGAACCCGCGCTACCTGGGCCCTGCCATTTCCTCAGGAGCGAT 
TTAC T TGGCG TCC TCAT ACC AGG AT AAATTAAGG GTC ATT TGCTG CAAGGGAAAC C TC GTG AAGGAGT 
CCGGCACTCAACACCACCGGGGCCCGTCCACCTCCCGCAGCAGCCCC^CAAGCGAGGCCCACCCACG 
TAC AACGAGCAC ATC ACC AAGCGCGTGGC CTCC AGCCCAGCGCCGCCCGAAGGCCCC AGC C ACCCGCG 
AGAGCCAAGCACACCCCACCGCTACCGCGAGGGGCGGACCGAGCTGCGCAGGGACAAGTCTCCTGGCC 
GCCCCCTGGAGCGAGAGAAGTCCCCCGGC CGGATGCTCAGCACGCGGAGAGAGCGGTCCCC CGGGAGG 
CTGTTTGAAGACAGCAGCAGGGGCCGGCTGCCTGCGGGAGCCGTGAGGACCCCGCTGTCCCAGGTGAA 
CAAGGTGTGGGACCAGTCTTCAGTAGTCGACGGC 



ORF Start: at 2 |ORF Stop: end of sequence





SEQ ID NO: 10 |623 aa |MW at 69278.9kD 


NOV1e,
268667555 
Protein 
Sequence 


TGTCGLPAEYATHFTEAFCRDKMNS PGLQTKEPS S SLHLEGWMKVPRNNKRGQQGWDRKY IVLEGSKV 

LIYDNKAREAGQRPVEEFELCI»PDGDVSIHGAVGASELANTAKAEVPYILKMESH 

UVPSFPDKQRWVTALESWAGGRVSREKAEADAKL^ 

GLYALNVLKNSLTHVPGIGAWQIYIIKDLEKLLMIAGEERALCLVDVKKVKQS 

NIFEAVKGCHLFGAGKIENGLCICAAMPSKWILRYNENLSKYCIRKEIETSEPC 

TNKFYEIDMKQYTIjEEFIiDKNDHSLAPAVF AAS SNSF P VS I VQVNSAGQREEYLLCFHEFGVFVDS YG 

RRSRTDDLKWSRLPLAFAYREPYIiFVTHFNSLEVIEIQARSSAGTPARAYLDIPNPRYLGPAISSGAI 

YLASSYQDKLRVICCKGNLVKESGTEHHRGPSTSRSSPNKRGPPTVNEHITKRVASSPAPPEGPSHPR 

EPSTPHRYREGRTELRRDKSPGRPLEREKSPGRMLSTRRERSPGRLFEDSSRGRLPAGAVRTPLSQVN 

KVWDQSSWDG 





SEQ ID NO: 11 |1915 bp


NOV1f,
268667574
DNA Sequence


C ACCGGTACCTGCGGC TTGCC TGCTGAATATGCC ACAC ACTTC ACCGAGGCCTTC TGCCGTGATAAAA 
TGAACTCCCCAGGTCTCCAGACCAAGGAGCCCAGCAGCAGCTTGCACCTGGAAGGGTGGATGAAGGTG 
CCCAGGAATAACAAACGAGGACAGCAAGGCTGGGACAGGAAGTACATTGTCCTGGAGGGATCAAAAGT 
CCTCATTTATGACAATGAAGCCAGAGAAGCTGGACAGAGGCCGGT<XjAAGAATTTGAGCTGTGCCTTC 
CCGACGGGGATGTATCTATTCATGGTGCCGTTGGTGCTTCCGAACTCGCA^ 

GTCCC ATACATAC TGAAGATG G AATCTC ACC C GCACACC ACC TGC TGGC CCGGGAGAACCCTCT AC TT 

GCTAGCTCCCAGCTTCCCTGACAAACAGCGCTGGGTCACCGCCTTAGAATCAGTTGTCGCAGG 

GAGTTTCTAGGGAAAAAGCAGAAGCTGATGCTGCCCGCGACTGTGTTTCTTACGAGCTTCTGCCTGCC 

TGGGTTCAGAAACTGCTTGGAAACTCCCTGCTGAAACTGGAAGGTGATGACCGTCTAGACATGAACTG 

CACACTGCCCTTCAGTGACCAGGTGGTGTTGGTGGGCACCGAGGAAGGGCTCTACGCCCTGAATGTCT 

TGAAAAAC TCCCTAACCCATGTCCC AGGAATTGG AGC AGTCTTCC AAATTTATATTATC AAGGACC TG 

GAGAAGCTACTCATGATAGCAGGAGAAGAGCGGGCACTGTGTCTTGTGGACGTGAAGAAAGTGAAACA 

GTCCCTGGCCCAGTCCCACCTGCCTGCCCAGCCCGACATCTCACCCAACATTTTTGAAGCTGTCAAGG 

GCTGCCACTTGTTTGGGGCAGGCAAGATTGAGAACGGGCTCTGCATCTGTGCAGCCATGCCCAGCAAA 

GTCGTCATTCTCCGCTACAACGAAAACCTCAGCAAATACTGCATCCGGAAAGAGATAGAGACCTCAGA 

GCCCTCCAGCTGTATCCACTTCACCAATTACAGTATCCTCATTGGAACCAATAAATTCTACGAAATCG 

ACATGAAGCAGTACACGCTCGAGGAATTCCTGGATAAGAATGACCATTCCTTGGCACCTGCTGTGTTT 

GCCGCCTCTTCCAACAGCTTCCCTGTCTCAATCGTGCAGGTGAACAGCGCAGGGCAGCGAGAGGAGTA

CTTGCTGTGTTTCCACGAATTTGGAGTGTTCGTGGATTCTTACGGAAGACGTAGCCGCACAGACGATC

TCAAGTGGAGTCGCTTACCTTTGGCCTTTGCCTACAGAGAACCCTATCTGTTTGTGACCCACTTCAAC
TCACTCGAAGTAATTGAGATCCAGGCACGCTCCTCAGCAGGGACCCCTGCCCGAGCGTACCTGGACAT
CCCGAACCCGCGCTACCTGGGCCCTGCCATTTCCTCAGGAGCGATTTACTTGGCGTCCTCATACCAGG 
ATAAATTAAGGGTCATTTGCTGCAAGGGAAACCTCGTGAAGGAGTCCGGCACTGAACACCACCGGGGC 
CCGTCCACCTCCCGCAGCAGCCCCAACAAGCGAGGCCCACCCACGTACAACGAGCACATCACCAAGCG 
CGTGGCCTCCAGCCCAGCGCCGCCCGAAGGCCCCAGCCACCCGCGAGAGCCAAGCACACCCCACCGCT 
ACCGCGAGGGGCGGACCGAGCTGCGCAGGGACAAGTCTCCTGGCCGCCCCCTGGAGCGAGAGAAGTCC 
CCCGGCCGGATGCTCAGCACGCGGAGAGAGCGGTCCCCCGGGAGGCTGTTTGAAGACAGCAGCAGGGG 
CCGGCTGCCTGCGGGAGCCGTGAGGACCCCGCTGTCCCAGGTGAACAAGGTGTGGGACCAGTCTTCAG 
TAGTCGACGGC 




ORF Start: at 2 |ORF Stop: end of sequence





SEQ ID NO: 12 |638 aa |MW at 71010.8kD 


NOV1f,
268667574 
Protein 
Sequence 


TGTCGL PAEYATHFTEAFCRDKMNS PGLQTKE PS S SLHLEGWMKVPRNNKRGQQGWDRK Y I VLEG SKY 

LIYDNEAREAGQRPVEEFELCLPIXSDVSIHGAVGASELANTAKAI^ 

LAPSFPDKQRWVTALESWAGGRVSREKAEADAARDCVSYELLPAWQKL^ 

TLPFSDQWLVGTEEGLYALNVLKNSLTHVPG IGAVFQIYI IKDLEKLLMI AGEERALCLVDVKKVKQ 
SI^QSHLPAQPDISPNIFEAVKGCHLFGAGKIENGLCICAAMPSKVVIIiRYNENLSKYCIRKEIETSE 
PCSC IHFTNYS ILIGTNKFYEIDMKQYTLEEFLDKITOHSLAPAVFAASSNSFPVSI VQVNSAGQREEY 
LLCFHEFGVFVDSYGRRSRTDDLKWSRLPLAFAYREPYLFVTHFNSLEVIEIQARSSAGTPARAYLDI 
PNPRYLGPAISSGAl^IJ^SYQDKLRVICCKGNLVKESGTEHHRGPSTSRSSPNKRGPPTxT^ITKR 
VASSPAPPEGPSHPREPSTPHRYREGRTELRRDKSPGRPLEREKSPGRMLSTRRERSPGRLFEDSSRG 
RLPAGAVRTPLSQVNKVWDQSSWDG 



SEQ ID NO: 13 |6201 bp


NOV1g,
CG106764-02 
DNA Sequence 


ATGTTGAAGTTCAAATATGGAGCGCGGAATCCTTTGGATGCTGGTGCT^ 

GGGCCTCCAGGCTGAATCTGTTCTTCCAGGGGAAACCACCCTTTATGACTCAACAGCAGATGTCTCC 


ATGAAGATTAAGCACGTGAGCAACTTTGTCCGGAAGTGTTCCGACACCATAGCTGAGTTACAGGAGC 
TCC AGCC TTCGGCAAAGGAC TTCGAAGTCAG AAGTCTTGTAGGTTGTGGTCACTTTGCTGAAGTGCA 
GG TGGTAAG AG AGAAAG CAACCGGGG AC ATC TATGCT ATG AAAGTGATG AAGAAG AAG G C TTT ATTG 
GCCCAGGAGCAGGTTTCATTTTTTGAGGAAGAGCGGAACATATTATCTCGAAGCACAAGCCCGTGGA 
TCCCCCAATTACAGTATGCCTTTCAGGACAAAAATCACCTTTATCTGGTGATGGAATATCAGCCT^ 
AGGGGACTTGCTGTCACTTTTGAATAGATATGAGGACCAGTTAGATGAAAACCTGATACAGTTTTAC 
CTAGCTGAGCTGATTTTGGCTGTTCACAGCGTTCATCTGATGGGATACGTGCATCGGGACATCAAGC 
CTGAGAAC^TTCTCGTTGACCGGACAGGACACATCAAGCTGGTGGATTTTGGATCTGCCGCGAAAAT 
GAATTCAAACAAGGTGAATGCCAAACTCCCGATTGGGACCCCAGATTACATGGCTCCTGAAGTGCTG 
ACTGTGATGAACGGGGATGGAAAAGGCACCTACGGCCTGGACTGTGACTGGTGGTCAGTGGGCGTGA 
TTGCC T ATGAGATGATTTATGGG AGATCC C CCTTCGCAGAGGGAACCTC TGCC AGAACC TTC AATAA 
CATTATGAATTTCCAGCGGTTTTTGAAATTTCCAGATGACCCCAAAGTGAGCAGTGACTTTCTTGAT 
CTGATTCAAAGCTTGTTGTGCGGCC^GAAAGAGAGACTGAAGTTTGAAGGTCTTTGCTGCCATCC^ 
TCTTCTCTAAAATTGACTGGAACAACATTCGTAACGCTCCTCCCCCCTTCGTTCCCACCCTCAAGTC 
TGACGATGAC ACCTCC AATTTTGATGAAC CAGAGAAGAATTCGTGGGTTTCATCCTC TCCGTGCC AG 
CTGAGCCCCTCAGGCTTCTCGGGTGAAGAACTGCCGTTTGT 

GGATTCTTGGTAGATCTGAGTCTGTTGTGTCGGGTCTGGACTCCCCTGCCAAGACTAGCTCCATGGA 
AAAGAAACTTCTCATCAAAAGCAAAGAGCTACAAGACTCTCAGGACAAGTGTCACAAGATGGAGCAG 
GAAATGACCCGGTTACATCGGAGAGTGTCAGAGGTGGAGGCTGTGCTTAGTCAGAAGGAGGTGGAGC 
TGAAGGCCTCTGAGACTGAGAGATCCCTCCTGGAGCAGGACCTTGC^ 

TAGCTTAAAG CGAAGTTTGGAGCAAG CAC GG ATGGAGGTGTC C C AG GAGGATG AC AAAGCACTGCAG 
CTTCTCCATGATATCAGAGAGCAGAGCCGGAAGCTCCAAGAAATCAAAGAGCAGGAGTACCAGGCTC 
AAGTGGAAGAAATGAGGTTGATGATGAATC AGT TG GAAGAGG ATC TTGTCTCAGCAAGAAGACGGAG 
TGATCTCTACGAATCTGAGCTGAGAGAGTCTCGGCTTGCTGCTGAAGAATTCAAGCGGAAAGCGACA 
GAATGTCAGCATAAACTGTTGAAGGCTAAGGATCAGGGGAAGCCTGAAGTGGGAGAATATGCGAAAC 
TGGAG AAGATC AATGC TGAGCAG CAGC TC AAAATTCAGGAGC TCC AAGAGAAAC TGGAGAAGGCTGT 
AAAAGCCAG C ACGG AGGCC AC C GAGC TGC TGC AGAAT ATCCGC C AGGC AAAGGAGCG AG C CGAG AGG 
GAGCTGGAGAAGCTGCAG AAC CGAG AGGATT CT TC TGAAGGC ATC AG AAAG AAGC TGGTGG AAGCTG 
AGG AACGC CGC GATTC TCTGGAG AACAAGGT AAAG AGACTAG AGACC ATGG AGCGT AGAGAAAACAG 
AC TG AAGG ATGAC ATCCAGACAAAATC C C AAC AGAT C CAGC AG ATGGCTG ATAAAATTC TGGAGCTC 
G AAG AG AAAC ATCGGGAGGC C C AAGTCTCAG C C CAG CAC C TAGAAGTGCACC TGAAACAG AAAGAGC 
AGCACTATGAGGAAAAGATTAAAGTATTGGACAATCAGATAAAGAAAGACCTGGCTGACAAGGAGAC
ACTGGAGAACATGATGCAGAGACACGAGGAGGAGGCCl^TO

AAGGCGATGATCAATGCTATGGATTCCAAGATCAGATCCCTGGAACAGAGGATTGTGGAACTGTCTG 

AAGC C AATAAAC TTGCAGCAAATAGCAGTCTTTTTACCCAAAGGAAC ATGAAGGCCC AAGAAGAGAT 

G ATTTC TGAAC TC AGGCAACAGAAATTTTAC C TGGAGAC ACAGGCTGGGAAGTTGGAGGCCCAGAAC 

C G AAAAC TGGAGG AGC AGC TGG AG AAG ATC AGCC ACC AAGACC AC AGTGAC AAG AATC GGCTG C TGG 

AACTGGAGACAAGATTGCGGGAGGTGAGTCTAGAGCACGAGGAGCAGAAACTGGAGCTCAAGCGCCA 

GCTCAC AGAGCT ACAGC TCTCCC TGCAGG AGCGCG AGTCACAG TTGACAGCCC TGCAGGCTG CACGG 

GCGGCCCTGGAGAGCCAGCTTCGCCAGGCGAAGACAGAGCTGGAAGAGACCACAGCAGAAGCTGAAG 

AGGAGATCCAGGCACTCACGGCACATAGAGATGAAATCCAGCGCAAATTTGATGCTCTTCGTAACAG 

C TGT AC TGTGATC AC AG AC CTGGAGGAGCAGCTAAAC CAGCTGACCGAGGACAACGC TGAACTC AAC 

AACCAAAACTTCTACTTGTCCAAACAACTCGATGAGGCTTCTGGCGCCAACGACGAGATTGTACAAC 

TGCGAAGTGAAGTGGACCATCTCCGCCGGGAGATCACGGAACGAGAGATGCAGCTTACCAGCCAGAA 

GCAAACGATGGAGGCTCTGAAGACCACGTGCACCATGCTGGAGGAACAGGTCATGGATTTGGAGGCC 

CTAAACGATGAGCTGCTAGAAAAAGAGCGGCAGTGGGAGGCCTGGAGGAGCGTCCTGGGTGATGAGA 

AATCCCAGTTTGAGTGTCGGGTTCGAGAGCTGCAGAGGATGCTGGACACCGAGAAACAGAGCAGGGC 

GAGAGCCGATCAGCGGATCACCGAGTCTCGCCAGGTGGTGGAGCTGGCAGTGAAGGAGCACAAGGCT 

GAGATTCTCGCTCTGCAGCAGGCTCTCAAAGAGCAGAAGCTGAAGGCCGAGAGCCTCTCTGACAAGC 

TC AATG AC C TGG AG AAG AAGC ATG C T ATGC TTGAAATGAATG CC CG AAGC TTAC AGCAG AAGC TGG A 

GACTGAACGAGAGCTCAAACAGAGGCTTCTGGAAGAGCAAGCCAAATTACAGCAGCAGATGGACCTG 

CAG AAAAATC ACATTTTCCGTCTGACTCAAGGAC TGC AAGAAGC TCTAGATCGGGCTGATC TACTG A 

AGACAGAAAGAAGTGACTTGGAGTATCAGCTGGAAAACATTCAGGTGCTCTATTCTCATGAAAAGGT 

GAAAATGGAAGGCACTATTTCTCAACAAACCAAACTCATTGATTTTCTGCAAGCCAAA^ 

C C TGC TAAAAAG AAAAAGGTGCC TCTGCAGTAC AATG AGC TG AAG C TGGC CC TGGAG AAGGAG AAAG 

CTCGCTGTGCAGAGCTAGAGGAAGCCCTTCAGAAGACCCGCATCGAGCTCCGGTCCGCCCGGGAGGA 

AGCTGCCCACCGCAAAGCAACGGACCACCCACACCCATCCACGCCAGCCACCGCGAGGCAGCAGATC 

GCCATGTCTGCCATC GTGCGGTCGCC AGAGCACCAGCCCAGTGC CATGAGCCTGCTGGCCC CGCCAT 

CCAGCCGCAGAAAGGAGTCTTCAACTCCAGAGGAATTTAGTCGGCGTCTTAAGGAACGCATGCACCA 

CAATATTCCTCACCGATTCAACGTAGGACTGAACATGCGAGCCACAAAGTGTGCTGTGTGTCTGGAT 

ACCGTGCACTTTGGACGCCAGGCATCCAAATGTCTAGAATGTCAGGTGATGTGTCACCCCAAGTGCT 

CCACGTGC TTGCCAGCCACCTGCGGCTTGCCTGCTG AATATGCC AC ACAC TTCACCGAGGCCTTCTG 

CCGTGACAAAATGAACTCCCC^GGTCTCCAGACCAAGGAGCCCAGC^GCAGCTTGCACCTGGAAGGG 

TGGATGAAGGTGCCCAGGAATAACAAACGAGGACAGCAAGGCTGGGACAGGAAGTACATTGTCCTGG 

AGGGATCAAAAGTCCTCATTTATGACAATGAAGCCAGAGAAGCTGGACAGAGGCCGGTGGAAGAATT 

TGAGCTGTGCCTTCCCGACGGGGATGTATCTATTCATGGTGCCGTTGGTGCTTCCGAACTCGCAAAT 

ACAGCCAAAGCAGATGTCCCATACATACTGAAGATGGAATCTCACCCGCACACCACCTGCTGGCCCG 

GGAGAACCCTCTACTTGCTAGCTL.ee AGCTTCCCTGAC AAAC- AGCGC 1 GGGTCACCGCCTTAGAATC 

AGTTGTCGCAGGTGGGAGAGTTTCTAGGGAAAAAGCAGAAGCTGATGCTAAACTGCTTGGAAACTCC 

ctgctgaaactggaaggtgatgaccgtctagacatgaactgcacgctgcccttcagtgaccaggtag 

tgttggtgggcaccgaggaagggctctacgccctoaatgtcttgaaaaactccctaacccatgtcl^ 

ac^aattggagcagtcttccaaatttatattatcaaggacctggagaagctactcatgatagcaggt 

gaagagcgggcactgtgtcttgtggacgtgaagaaagtgaaacagtccctggcccagtcccacct^ 

ctgcccagcccgacwtctcaccc^catttttcaagc^ 

caagattgagaacgggctctgcatctgtgcagccatgccc^^ 

gaaaacctcagcaaatactccatccggaaagagatagagacctc^^ 

tcaccaattacagtatcctcattcgaaccaat^^ 

CGAGGaATTCCTGGATAAGAAT<3ACCATTCCTT<3GC7^ 




AATTTGGAGTGTTCGTX^ATTCTTACGGAAGACGTAGCCGCACAGACGATCTCaAGTGGAGTCGCTT 
ACCTTTGGCCTTTGCCTACAGAGAACCCTATCTGTTTGTGACCCACTTCAACTCACTCGAAGTaATT 
GAGATCCAGGCACGCTCCTCAGCAGGGACCCCTGCCCGAGCGTACCTGGACATCCCGaACCCGCGCT 

acctgggccctgccatttcctcagcagcgatttactt<mcgtcctcataccaggataaattaagggt 

catttgctgcaagggaaacctcgtgaaggagtccggcactgaacaccaccggggcccgtccacctcc 

cgcagcagccccaacaagcgaggcccacccacgtacaacgagcacatcaccaagcgcgtggcctcca 

gcccagcgccgcccgaaggccccagccacccgcgagagccaagcacaccccaccgctaccgcgaggg 

gcggaccgagctgcgcagggacaagtctcctggccgccccctggagcgagagaagtcccccggccgg 

atgctcagcacgcggagagagcggtcccccgggaggctgtttgaagacagc^^ 

ctgcgggagccgtgaggaccccgctgtcccaggtgaacaaggtga^^ 

gtctgttgcggaggccaggagtgacttggggaactga 




ORF Start: ATG at 1 |ORF Stop: TGA at 6199





SEQ ID NO: 14 |2066 aa |MW at 236008.5kD 


NOV1g,
CG106764-02 
Protein 
Sequence 


MLKFKYGARNPLDAGAAEPIASRASRLNLFFQGKPPFMTQQQMSPLSREGILDALFVIiFEECSQPAL 
MKIKHVSNFVRKCSDTIAELQELQPSAKDFEVR^ 

AQEQVS FFEEERNI LSRSTS PWI PQLQYAFQDKNHL YLVMEYQ PGGDLLSLLNR YEDQLDENL IQFY 
L AEL I LAVHS VHLMGYVHRDIKPENI LVDRTGH IKLVDFGS AAKMNSNKVNAKLP IGT PDYMAPEVL 
TVMNGDGKGTYGLDCDWWSVGVIAYEMIYGRSPFAEGTSARTFNNIMNFQRFLKFPDDPKVSSDFLD 
LIQSLLCGQKERLKFEGLCCHPFFSKIDWNNIRNAPPPFVPTLKSDDDTSNFDEPEKNSWVSSSPCQ 
LSPSGFSGEELPFVGFSYSKALGILGRSESVVSGLDSPAKTSSMEKKLLIKSKELQDSQDKCHKMEQ
EMTRLHRRVSEVEAVLSQKEVELKASETQRSLLEQDlkTtlT
LLHDIREQSRKLQEIKEQEYQAQVEEMRLMMNQLEEDLVSARRRSDLY^ 

ECQHKLLKAKDQGKPEVGE YAKLEK INAEQQLKIQELQEKLEKAVKASTEATELLQNI RQAKERAER 
ELEKLQNREDSSEGIRKKLVEAEERRHSLENKVKRLETMER 

EEKHREAQ V S AQHL EVHLKQKEQHYEEK I KVLDNQ I KKDLADKETLENMMQRHEEEAHEKGK I L S EQ 
KAMINAMDSKIRSLEQR I VELSEANKLAANS SLFTQRNMKAQEEMI SELRQQKF YLETQAGKLEAQN 
RKLEEQLEKISHQDHSDKNPJiLELETRXtREVSLEHEEQKLELKRQXiTELQLSLQERESQLTALQAAR 
AALESQLRQAKTELEETTAEAEEE IQALTAHRDEI QRKFDALRNSCTVI TDLEEQLNQLTEDNAELN 
NQNF YL SKQLDEAS GANDE I VQLRS EVDHLRRE I TER EMQL T S QKQTMEALKT TC TML EEQVMDL EA 
LNDELLEKERQWEAVraSVLGDEKSQFECRVRELQRMLOTEK 
EIIJ^QQALKEQKLKAESLSDKI^LEKKHAMLEMNARSLQQKLETERELKQ 

QKNH I FRL TQGLQEALDRADLLKTERS DL EYQLEN I QVL YSHEKVKMEGT I SQQTKL IDFLQAKMDQ 
PAKKKKVPLQ YNELKLALEKEKARCAELEEALQKTRX ETjRSAREEAAHRKATDHPHPSTPATARQQ I 
AMSAIVRSPEHQPSAMSLLAPPSSRRKESSTPEEFSRRLKERWHHNIPHRFOTGIJ^MRATKCAVCLD 
TVHFGRQASKCLECQVMCHPKCSTCLPATCGLPAEYATHFTEAFCRDKMNSPGLQTKEPSSSLHLEG 
W1^VPR1^RGQ(^WDRKYIVLEGSKVLIYDNEAREAGQRF^EFELCLPI^DVSIHGAVGASEI^ 
TAKADVP Y I LKMESH PHTTCWPGRTLYLL APSFPDKQRWVTALES WAGGRVSREKAEADAKLLGNS 
LLKLEGDDRLDMNC TL PFSDQWLVGTEEGL YALNVLKNSLTHVPGIGAVFQ I YI I KDLEKLLMI AG 
EERALCLVDVKKVKQSLAQSHLPAQ PDI S PN IFEAVKGCHLFGAGKIENGLC ICAAMPSKWILRYN 
ENLSKYC I RKE I ETSEPC S C I HFTNYS I LIGTNKF YE IDMKQ YTLEEFLDKNDHSLAPAVFAASSNS 
FPVSIVQWSAGQREEYLLCFHEFGVFVDSYGRRSRTDDLKWSRLPLAFAYREPYLFVTHFNSLEVI 
E IQAR S S AGTP ARAYLDI PNPRYLG PA I S SGAI YLASSYQDKLRVICCKGNLVKE SGTEHHRGPSTS 
RS S PNKRGP PT YNEHI TKRVAS SPAP PEGPSHPRBPS TPHRYREGRTELRRDKS PGRPLEREKSPGR 
MLS TRRER S PGRLFEDS SRGRLPAGAVRT PLSQVNKVRQHS EACVS VAEARSDLGN 



Sequence comparison of the above protein sequences yields the following sequence
relationships shown in Table 1B.

Table 1B. Comparison of NOV1a against NOV1b through NOV1g.

Protein Sequence | NOV1a Residues; Match Residues | Identities; Similarities for the Matched Region
NOV1b | 1..615; 5..620 | 601/616 (97%); 602/616 (97%)
NOV1c | 615..1442; 4..831 | 690/828 (83%); 691/828 (83%)
NOV1d | 615..1442; 4..846 | 690/843 (81%); 691/843 (81%)
NOV1e | 1436..2053; 3..620 | 618/618 (100%); 618/618 (100%)
NOV1f | 1436..2053; 3..635 | 618/633 (97%); 618/633 (97%)
NOV1g | 1..2051; 1..2051 | 1900/2051 (92%); 1900/2051 (92%)
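
The percentages in the comparison table above can be recomputed from the match counts. The Python sketch below parses a cell such as `601/616 (97%)` and recomputes the percentage by integer truncation. Truncation is an assumption chosen because it reproduces every figure in this particular table; the source does not state the rounding rule, and the function names are illustrative.

```python
import re
from typing import Tuple

# Identity/similarity cells copied from the NOV1a comparison table.
TABLE_1B_CELLS = [
    "601/616 (97%)", "602/616 (97%)",
    "690/828 (83%)", "691/828 (83%)",
    "690/843 (81%)", "691/843 (81%)",
    "618/618 (100%)", "618/633 (97%)",
    "1900/2051 (92%)",
]

_CELL = re.compile(r"\s*(\d+)\s*/\s*(\d+)\s*\((\d+)%\)\s*")


def parse_cell(cell: str) -> Tuple[int, int, int]:
    """Split 'matches/length (pct%)' into three integers."""
    m = _CELL.fullmatch(cell)
    if m is None:
        raise ValueError(f"unrecognised cell: {cell!r}")
    return int(m.group(1)), int(m.group(2)), int(m.group(3))


def truncated_percent(matches: int, length: int) -> int:
    """Percent of matching positions, truncated toward zero (assumed rule)."""
    if length <= 0:
        raise ValueError("alignment length must be positive")
    return (100 * matches) // length


if __name__ == "__main__":
    for cell in TABLE_1B_CELLS:
        matches, length, reported = parse_cell(cell)
        print(cell, "->", truncated_percent(matches, length), "reported", reported)
```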



Further analysis of the NOV1a protein yielded the following properties shown in
Table 1C.

Table 1C. Protein Sequence Properties NOV1a

PSort analysis: | 0.9800 probability located in nucleus; 0.3000 probability located in microbody (peroxisome); 0.1000 probability located in mitochondrial matrix space; 0.1000 probability located in lysosome (lumen)
SignalP analysis: | No Known Signal Sequence Predicted
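
The PSort entry above is a semicolon-separated list of probability and compartment pairs. As a minimal sketch, the Python below parses that text and picks the highest-scoring compartment; the regular expression and function names are illustrative assumptions about the text layout, not part of any PSort tooling.

```python
import re
from typing import List, Tuple

# PSort line for NOV1a, copied from the table above.
PSORT_NOV1A = (
    "0.9800 probability located in nucleus; "
    "0.3000 probability located in microbody (peroxisome); "
    "0.1000 probability located in mitochondrial matrix space; "
    "0.1000 probability located in lysosome (lumen)"
)

_ENTRY = re.compile(r"(\d+\.\d+) probability located in ([^;]+)")


def parse_psort(text: str) -> List[Tuple[str, float]]:
    """Return (compartment, probability) pairs in the order they appear."""
    return [(loc.strip(), float(prob)) for prob, loc in _ENTRY.findall(text)]


def top_location(text: str) -> Tuple[str, float]:
    """Return the compartment with the highest reported probability."""
    entries = parse_psort(text)
    if not entries:
        raise ValueError("no PSort entries found")
    return max(entries, key=lambda entry: entry[1])


if __name__ == "__main__":
    print(top_location(PSORT_NOV1A))
```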



A search of the NOV1a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 1D.



Table 1D. Geneseq Results for NOV1a

Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV1a Residues; Match Residues | Identities; Similarities for the Matched Region | Expect Value
AAU03501 | Human protein kinase #1 - Homo sapiens, 2053 aa. [WO200138503-A2, 31-MAY-2001] | 1..2051; 1..2053 | 2044/2053 (99%); 2046/2053 (99%) | 0.0
AAB43359 | Human ORFX ORF3123 polypeptide sequence SEQ ID NO:6246 - Homo sapiens, 1286 aa. [WO200058473-A2, 05-OCT-2000] | 768..2053; 1..1286 | 1286/1286 (100%); 1286/1286 (100%) | 0.0
ABB11117 | Human RHO/RAC effector homologue, SEQ ID NO:1487 - Homo sapiens, 999 aa. [WO200157188-A2, 09-AUG-2001] | 968..1947; 1..980 | 976/980 (99%); 976/980 (99%) | 0.0
AAU31443 | Novel human secreted protein #1934 - Homo sapiens, 910 aa. [WO200179449-A2, 25-OCT-2001] | 1114..1982; 1..869 | 867/869 (99%); 867/869 (99%) | 0.0
AAE16261 | Human kinase PKIN-7 protein - Homo sapiens, 497 aa. [WO200196547-A2, 20-DEC-2001] | 1..467; 1..468 | 463/468 (98%); 465/468 (98%) | 0.0

In a BLAST search of public sequence databases, the NOV1a protein was found to
have homology to the proteins shown in the BLASTP data in Table 1E.

Table 1E. Public BLASTP Results for NOV1a

Protein Accession Number | Protein/Organism/Length | NOV1a Residues; Match Residues | Identities; Similarities for the Matched Portion | Expect Value
O88938 | Rho/rac-interacting citron kinase - Mus musculus (Mouse), 2055 aa. | 1..2053; 1..2055 | 1974/2055 (96%); 2014/2055 (97%) | 0.0
O88528 | Citron-K kinase - Mus musculus (Mouse), 1641 aa (fragment). | 373..2053; 1..1641 | 1599/1683 (95%); 1616/1683 (96%) | 0.0
P49025 | Citron protein (Rho-interacting, serine/threonine kinase 21) - Mus musculus (Mouse), 1597 aa. | 467..2053; 9..1597 | 1563/1589 (98%); 1578/1589 (98%) | 0.0
Q9QX19 | Postsynaptic density protein - Rattus norvegicus (Rat), 1618 aa. | 467..2053; 1..1618 | 1556/1619 (96%); 1573/1619 (97%) | 0.0
O14578 | Citron protein (Rho-interacting, serine/threonine kinase 21) - Homo sapiens (Human), 1286 aa (fragment). | 768..2053; 1..1286 | 1286/1286 (100%); 1286/1286 (100%) | 0.0



PFam analysis predicts that the NOV1a protein contains the domains shown in
Table 1F.

Table 1F. Domain Analysis of NOV1a

Pfam Domain | NOV1a Match Region | Identities; Similarities for the Matched Region | Expect Value
pkinase | 97..359 | 89/302 (29%); 196/302 (65%) | 2.7e-62
pkinase_C | 360..389 | 15/32 (47%); 24/32 (75%) | 0.00023
DAG_PE-bind | 1389..1437 | 14/51 (27%); 34/51 (67%) | 6.1e-05
PH | 1470..1589 | 20/121 (17%); 87/121 (72%) | 1.8e-11
CNH | 1618..1915 | 107/378 (28%); 289/378 (76%) | 1.5e-110
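
The match regions in the domain table above can be checked for ordering and overlap. The Python sketch below computes the length of each region and confirms that consecutive domains do not overlap. The coordinates for pkinase and pkinase_C are read here as `97..359` and `360..389`, which is an assumption about how the printed ranges should be interpreted; the function names are illustrative.

```python
from typing import List, Tuple

# (Pfam domain, first residue, last residue) for NOV1a, 1-based inclusive.
NOV1A_DOMAINS: List[Tuple[str, int, int]] = [
    ("pkinase", 97, 359),       # range as interpreted from the table (assumption)
    ("pkinase_C", 360, 389),    # range as interpreted from the table (assumption)
    ("DAG_PE-bind", 1389, 1437),
    ("PH", 1470, 1589),
    ("CNH", 1618, 1915),
]


def region_length(start: int, end: int) -> int:
    """Length of a 1-based inclusive residue range."""
    if start < 1 or end < start:
        raise ValueError(f"invalid range {start}..{end}")
    return end - start + 1


def non_overlapping(domains: List[Tuple[str, int, int]]) -> bool:
    """True if, after sorting by start, every domain begins after the previous one ends."""
    ordered = sorted(domains, key=lambda d: d[1])
    return all(prev[2] < nxt[1] for prev, nxt in zip(ordered, ordered[1:]))


if __name__ == "__main__":
    for name, start, end in NOV1A_DOMAINS:
        print(f"{name}: {start}..{end} ({region_length(start, end)} residues)")
    print("non-overlapping:", non_overlapping(NOV1A_DOMAINS))
```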



Example 2. 

The NOV2 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 2A.

Table 2A. NOV2 Sequence Analysis


SEQ ID NO: 15 |1238 bp


NOV2a,
CG117662-01
DNA Sequence


ATGGATGGATGGAGAAGGATGCCTCGCTGGGGACTGCTGCTGCTGCTCTGGGGGTCCTGTACCTTTGG 

TCTCCCGACAGACACCACCACCTTTAAACGGATCTTCCTCAAGAGAATGCCCTCAATCCGAGAAAGCC 

TGAAGGAACGAGGTGTGGACATGGCCAGGCTTGGTCCCGAGTGGAGCCAACCCATGAAGAGGCTGA 

C TTGGC AACACCAC CTCCTCCGTGATCCTC AC C AACTAC ATGG ACACCC AGT ACTATGGCGAGATTGG 

CATCGGCACCCCACCCCAGACCTTCAAAGTCGTCTTTGACACTGGTTCGTCCAATGTTTGGGTGCCCT 

CCTCCAAGTGCAGCCGTCTCTACACTGCCTGTGTGTATCACAAGCTCTTCGATGCTTCGGATTCCTCC 

AGCTACAAGCACAATGGAACAGAACTCACCCTCCGCTATTCAACAGGGACAGTCAGTGGCTTTCTCAG 

CCAGGACATCATCACCGTGGGTGGAATCACGGTGAOVCAGATGTTTGGAGAGGTCACGGAGATGCCCG 

CC TTAC CCTTC ATGCTGGCCGAGTTTGATGGGGTTGTGGGC ATGGGCTTC ATTGAACAGGCC ATTGGC 

AGGGTCACCCCTATCTTCGACAACATCATCTCCCAAGGGGTGCTAAAAGAGGACGTCTTCTCTTTCTA 

CTACAACAGAGATTCCGAGAATTCCCAATCGCTGGGAGGACAGATTGTGCTGGGAGGCAGCGAC^ 

AGCATTACGAAGGGAATTTCCACTATATCAACCTCATCA^ 

GGGGTGTCTGTGGGGTCATCCACCTTGCTCTGTGAAGACGGCTGCCTGGCATTGGTAGACACCGGTGC 
ATCCTACATCTCAGGTTCTACCAGCTCCATAGAGAAGCTCATGGAGGCCTTGGGAGCCAAGAAGAGGC 
TGTTTGATTATGTCGTGAAGTGTAACGAGGGCCCTACACTCCCCGACATCTCTTTCCACCTGGGAGGC 
AAAGAATACACGCTCACCAGCGCGGACTATGTATTTCAGGAATCCTACAGTAGTAAAAAGCTGTGCAC 
ACTGGCCATCCACGCCATGGATATCCCGCCACCCACTGGACCCACCTGGGCCCTGGGGGCCACCTTCA 
TC CGAAAGTTC TAC AC AGAGTTTGATCGGCGTAAC AACCGC ATTGGCTTCGCCTCGGCCCGCTGAGGC 
CCTCTGCCACCCAG 




ORF Start: ATG at 1 | |ORF Stop: TGA at 1219 






SEQ ID NO: 16 |406 aa |MW at 45030.9kD


NOV2a,
CG117662-01
Protein 
Sequence 


MDGWRRMPRWGLLLLLWGSCTFGLPTDTTTFKRIFLK^

LGNTTS S VILTNYMDTQ YYGE IG IGTPPQTFKVVFDTGS SNVWPS SKCSRLYTACVYHKLFDASDS S 
S YXHNGTEL TLR YSTGTVSGFL SQDI I TVGG I TVTQMFGEVTEMPAL PFMLAEFDG WGMGF I EQAIG 
RVTPIFDNIISQGVLKEDVFSFYYNRDSENSQSIX^QIVLGGSDPQHYEGNFHYINLIKTGVWQIQ^ 
GVSVGSSTIiLCEDGCLALVOTGASYISGSTSSIEK^ 

KEYTLTSADYVFQESYSSKKLCTLAIHAMDIPPPTGPTWALGATFIRKFYTEFDRRNNRIGFASAR

SEQ ID NO: 17 |911 bp


NOV2b,
CG117662-02
DNA Sequence


ATGGATGGATGGAGAAGGATGCCTCGCTGGGGACTGCTGCTGCTGCTCTGGGGCTCCTGTACCTTTG 
GTCTCCCGACAGACACCACCACCTTTAAACGGATCTTCCTCAAGAGAATGCCCTCAATCCGAGAAAG 
CCTGAAGGAACGAGGTGTGGACATGGCCAGGCTTGGTCCCGAGTGGAGCCAACCCATGAAGAGGCTG 
ACACTTGGCAACACCACCTCCTCCGTGATCCTCACCAACTACATGGACACCCAGTACTATGGCGAGA 
TTGGCATCGGCACCCCACCCCAGACCTTCAAAGTCGTCTTTGACACTGGTTCGTCCAATGTTTGGGT 
GCCCTCCTCCAAGTGCAGCCGTCTCTACACTGCCTGTGTGTATCACAAGCTCTTCGATGCTTCGGAT 
TCCTCCAGCTACAAGCACAATGGAACAGAACTCACCCTCCGCTATTCAACAGGGACAGTCAGTGGCT 
TTCTCAGCC AGGACATCATC ACCGTGTCTGTGGGGTCATCCACC TTAC TCTGTGAAGACGGCTGCCT 
GGCATTGGTAGACACCGGTGCATCCTACATCTCAGGTTCTACCAGCTCCATAGAGAAGCTCATGGAG 
GCCTTGGGAGCCAAGAAGAGGCTGTTTGATTATGTCGTGAAGTGTAACGAGGGCCCTACACTCCCCG 
ACATCTCTTTCCACCTGGGAGGCAAAGAATACACGCTCACCAGCGCGGACTATGTATTTCAGGAATC 
CTACAGTAGTAAAAAGCTGTGCACACTGGCCATCCACGCCATGGATATCCCGCCACCCACTGGACCC 
ACCTGGGCCCTGGGGGCCACCTTCATCCGAAAGTTCTACACAGAGTTTGATCGGCGTAACAACCGCA 
TTGGC TTCGCC TCGG CCCGCTGAGGCCC TC TGCC ACCC AG 




ORF Start: ATG at 1 | |ORF Stop: TGA at 892 





SEQ ID NO: 18 |297 aa |MW at 33025.3kD


NOV2b,
CG117662-02
Protein 
Sequence 


MDGWRRMPRWGLLLLLWGSCTFGLPTDTTTFK^

TLGNTTSSVILTNYMDTQYYGEIGIGTPPQTFKVVFDTGSSim^PSSKCSRLyTACVYHKLFDASD 
SSSYKHNGTELTLRYSTGTVSGFLSQDIITVSVGSSTLLCEDGCLALVDTGASYISGSTSSIEKLME 
ALGAKKRLFDYVVKCNEG PTL PDI SFHLGGKEYTLTSADYVFQES YS SKKLC TLAIHAMDI PPP TG P 
TWALGATF IRKF YTEFDRRNNR IGFAS AR 
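
The ORF coordinates and protein lengths listed for NOV2a and NOV2b can be cross-checked arithmetically. The Python sketch below assumes that coordinates are 1-based, that "ATG at N" gives the first base of the start codon, and that "TGA at N" gives the first base of the stop codon; under those assumptions the number of coding codons should equal the stated amino-acid count. The function name `orf_codon_count` is illustrative.

```python
# (name, ATG position, stop-codon position, stated protein length in aa)
# copied from the NOV2a and NOV2b entries above.
ORFS = [
    ("NOV2a", 1, 1219, 406),
    ("NOV2b", 1, 892, 297),
]


def orf_codon_count(start: int, stop: int) -> int:
    """Number of sense codons between a start codon and a stop codon.

    start: 1-based position of the first base of the start codon.
    stop:  1-based position of the first base of the stop codon (assumed).
    """
    span = stop - start
    if span <= 0:
        raise ValueError("stop codon must lie downstream of the start codon")
    if span % 3 != 0:
        raise ValueError("start and stop codons are not in the same frame")
    return span // 3


if __name__ == "__main__":
    for name, start, stop, stated in ORFS:
        codons = orf_codon_count(start, stop)
        status = "consistent" if codons == stated else "MISMATCH"
        print(f"{name}: {codons} codons vs {stated} aa stated ({status})")
```

The same check applies to any entry that gives explicit start and stop positions; entries whose ORF runs to the end of the sequence have no stop coordinate and are not covered by this sketch.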



Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 2B. 



Table 2B. Comparison of NOV2a against NOV2b.

Protein Sequence | NOV2a Residues; Match Residues | Identities; Similarities for the Matched Region
NOV2b | 1..165; 1..165 | 165/165 (100%); 165/165 (100%)



Further analysis of the NOV2a protein yielded the following properties shown in
Table 2C.

Table 2C. Protein Sequence Properties NOV2a

PSort analysis: | 0.3700 probability located in outside; 0.2541 probability located in microbody (peroxisome); 0.1900 probability located in lysosome (lumen); 0.1000 probability located in endoplasmic reticulum (membrane)
SignalP analysis: | Cleavage site between residues 24 and 25

A search of the NOV2a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 2D.



Table 2D. Geneseq Results for NOV2a

Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV2a Residues; Match Residues | Identities; Similarities for the Matched Region | Expect Value
AAW23244 | Human renin - Homo sapiens, 406 aa. [WUy/zooo4-Al, 14-AUG-1997] | 1..406; 1..406 | 404/406 (99%) | 0.0
AAP50135 | Sequence of pre-pro-renin - Homo sapiens, 406 aa. [EP135347-A] | 1..406; 1..406 | 404/406 (99%); 404/406 (99%) | 0.0
ABB11781 | Human renin homologue, SEQ ID NO:2151 - Homo sapiens, 438 aa. [WO200157188-A2, 09-AUG-2001] | 1..406; 31..438 | 391/408 (95%); 393/408 (95%) | 0.0
AAU72879 | Human aspartyl protease partial protein sequence #4 - Homo sapiens, 412 aa. [WO200183782-A2, 08-NOV-2001] | 24..405; 14..409 | 169/400 (42%); 246/400 (61%) | 1e-90
AAY93685 | Amino acid sequence of novel polypeptide PRO292 - Homo sapiens, 412 aa. [WO200037640-A2, 29-JUN-2000] | 24..405; 14..409 | 169/400 (42%); 246/400 (61%) | 1e-90



In a BLAST search of public sequence databases, the NOV2a protein was found to
have homology to the proteins shown in the BLASTP data in Table 2E.

Table 2E. Public BLASTP Results for NOV2a

Protein Accession Number | Protein/Organism/Length | NOV2a Residues; Match Residues | Identities; Similarities for the Matched Portion | Expect Value
P00797 | Renin precursor, renal (EC 3.4.23.15) (Angiotensinogenase) - Homo sapiens (Human), 406 aa. | 1..406; 1..406 | 405/406 (99%); 405/406 (99%) | 0.0
Q9TSZ1 | Preprorenin precursor (EC 3.4.23.15) - Callithrix jacchus (Common marmoset), 400 aa. | 1..406; 1..400 | 381/406 (93%); 389/406 (94%) | 0.0
P52115 | Renin precursor, renal (EC 3.4.23.15) (Angiotensinogenase) - Ovis aries (Sheep), 400 aa. | 7..406; 1..400 | 292/401 (72%); 338/401 (83%) | e-175
Q15296 | Kidney mRNA fragment for renin (Aa 105-401) - Homo sapiens (Human), 300 aa (fragment). | 108..406; 1..300 | 297/300 (99%); 298/300 (99%) | e-172
P06281 | Renin precursor, renal (EC 3.4.23.15) (Angiotensinogenase) - Mus musculus (Mouse), 402 aa. | 5..406; 4..402 | 281/403 (69%); 331/403 (81%) | e-167



PFam analysis predicts that the NOV2a protein contains the domains shown in
Table 2F.

Table 2F. Domain Analysis of NOV2a

Pfam Domain | NOV2a Match Region | Identities; Similarities for the Matched Region | Expect Value
asp | 31..405 | 174/428 (41%); 339/428 (79%) | 4.1e-197



Example 3. 

The NOV3 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 3A.

Table 3A. NOV3 Sequence Analysis

SEQ ID NO: 19 |2827 bp


NOV3a,
CG118051-01
DNA Sequence


TGG CGATG C T AC TGTTT AAT TG C AGG AGGTGGGGGTGTG TGT ACC ATG T AC C AGGC5 C T ATT AG AAGC A 


AGAACXSAAGGAGGGAGGGCAGAGCGCCCTGCTGAGCAACAAAGGACTCCTGCAGCCTTCTCTGTCTGT 


CTCTTGGCACAGGCACATGGGGAGGCCTCCCGCAGGTGGGGGGCCACCAGTCCAGGGGTGGGAGCACT 


ACAGGGCACGAGTTGGTTTGGGAGCTGCCAGTCTCCTGGGAGGATCGCAGTCAGCAGAGCAGGGCTGA 


GGCCTGGGGGTAGGAGCAGAGCCTGCGCATCTGGAGGCAGCATGTCCAAGAAAGGGAGTGGAGGTGCA 


GCGAAGGACCCAGGGGCAGAGCCCACGCTGGGGAT<^ACCCCTTCGAGGACACACTGCGGCGGCTGCG 


TGAGGCCTTCAACTGAGGGCGCACGCGGCCGGCCGAGTTCCGGGCT<3CGCAGCTCCAGGGCCT<3GGCC 


ACTTCCTTCAAGAAAACAAGCAGCTTCTGCGCGACGTGCTGGCCCAGGACCTGCATAAGCCAGCTTTC 


G AGGCAG ACATATCTGAGCTCATCCTTTGCC AGAACGAGGTTGACTACGC TC TC AAGAACC TTCAGGC 


CTGGATGAAGGATGAACCACGGTCC^CGAACCTGTTCATGAAGCTGGACTCGGTCTTCATC 

AACCCTTTOGCCTGX3TCCTCATCATCGCACCCTGGAACTACCCATTGAACCTGACCCTCGTGCTCCTG 

GTGGGCACCCTCCCCGCAGGGAATT<XX3TGGTGCTGAAGCCGTCAGAAATCAGCCAGGGCACAGAGAA 

GGTCCTGGCTGAGGTGCTGCCCCAGTACCTGGACCAGAGCTGCTTTGCCGTGGTGCTGGGCGGACCCC 

AGGAGACAGGGCAGCTGCTAGAGCACAAGTTGGACTACATCTTCTTCACAGGGAGCCCTCGTGTGGGC 

aagattgtc atgactgc tgccacc aagcacc tgacgcctgtcaccc tggagctggggggcaagaaccc 
ct<x:tacgtggacgacaactgcgacccccagaccgtggcc^ccgcgtggcctggttctgctacttca 
atgccggccagacc tgcgtggccc ctgactacgtcc tgtgcagc cc cgagatgcaggagaggctgctg 
cccgccctgcagagcaccatcacccgtttctatg 

catcatcaaccagaaacagttccagcggctgcgggcattgctgggctgcggccgcgtgg 
gccagagcaacgagagcgatcgctacatcgcccccacggtgctggtggacgtgcagg 

GTGATGCAGGAGGAGATC TTCGGGCCC ATCU TGC V.C ATQ-G rGAAL GTGC AGAGCG iGGACGAGGCCAT 

caagttcatcaaccggcaggagaagcccctggccctgtacgccttctccaacagcagacaggttgtga 
accagatgctggagcggacgagcagcggcagctttggaggcaatgagggcttcacctacatatctctg 
ctgtccgtgccattcgggggagtcggccacagtgggatgggccggtaccacg 

CACCTTCTCCCACCACCGCACCTGCCTGCTCGCCCuC ivCGGCC I GGAbAAATTAAAGGAGATCCGCT 

acccaccctataccgactggaaccagcagctgttacgctggggcatgggctcccag 
ctgtqagcgtcccacccgcctccaacgggtcacacagagaaacctgagtctagccatgaggggcttat 


gctcccaactcacattgttcctccagaccgcaggctcccccagcctcaggttgctggagctgtcacat 


gactgcatcctgcctgccagggctgcaaagcaaggtcttgcttctatctgggggacgctgctcgagag 


aggccgagaggccgcagaacatcccaggtgtcctcactcaccccaccctccccaattccagccctttg 


ccctctcggtcagggttggccaggcccagtcacaggggcagtgtcaccctggaaaatacagtgccctg 




aggcacacgcgcacttccacctctgccccatcccaactgcaccagcactgcctcccccagggatc 


1a»AL.A1,\-H^ALAv- IvjGlV- luLALL H-U 1U 1 v>^s 1 l\.A^flv.tuWi»-LL HjL AL. 1 UAL L L AUAbU Ab 


CTCCATCC^CTGGGAAAACTGGGGTTTGCATCACTC^ 


GTCCCTTGACTTCTCTGAGCCTCAGTTTCCTTATGTGAAAGTTGCTGGAACCAAAATGGAGTCAC'rTA 


TGCCAAACTCTAATAAAATgGAGTCGGGGGGGCACATAGAAGCCCTCACACACACATGCCCGTAACAG 


GATTTATCACC AAGAC ACGCCTGCATGTAAGACC AGAC AC AGGGCG TATGGAAAAGC ACGTCC TCAAA 


GACTGTAGTATTCCAGATGAGCTGCAGATGCTTACCTACCACGGCCGTCTCCACCAGAAAACCATCGC 


caactcctgcgatcagcttgtgacttacaaaccttgtttaaaagctgcttacatggacttctgtcctt 


taaaacgttccccttggctgtggccctctgtctatgcctgggatccttccaagca 


taggaatcctctgctcctcccaaataaattcatctgttc 




ORF Start: ATG at 617 |ORF Stop: TGA at 1772





SEQ ID NO: 20 |385 aa |MW at 42794.8kD 


NOV3a,
CG118051-01
Protein 
Sequence 


mkdeprsti^fmkldsvfiwkepfglvliiapwotplnltlvllvgtlpagncvvlkpseisqgtekv 
laevlpq yldq sc fawlggpqetgqllehkldyi fftg s prvgk i vmtaatkhltpvtlelggknpc 
yvddncdpc^anrvawfcyfnagqtcvapdyvlcspemqe^lpalqstitrfygddp^ 
inqkqfqrlrallg c gr vai ggq snesdry i ap tvl vd vqet e p vmq ee i fg p i l p i vnvq svdea ik 

FIimQEKPrja,YAFSNSRQVVNQMLERTSSGSFGGlJEGIWISLLSVPFGGVGHSGMGRYHGKFTFDT 
FSHHRTCLLAPSGLEKLKEIRYPPYTDWNQQLLRWGMGSQSCTLL 





SEQ ID NO: 21 |1586 bp


NOV3b,
CG118051-02 
DNA Sequence 


CACGAGTTGGTTTGGGAGCTGCCAGTCTCCTGGGAGGATCGCAGTCAGCAGAGCAGGGCTGAGGCCT 


GGGGGTAGGAGCAGAGCCTGCGCATCTGGAGGCAGCATGTCCAAGAAAGGGAGTGGAGGTGCAGCGA 


AGGACCCAOTGGC^GAGCCCACGCT<5GGGAT<^ACCCCTTCGAGGACAC^CTGCX^CGGCTGCGTGA 


GgCCTTCAACTGAGGGCGCACGKTGGCCGGCCGAGTTCCGGGCTgCGCAGCTCCAGGGCCTGGGCCAC 


TTCCTTCAAGAAAACAAGCAGCTTCTGCGCGACGTGCTGGCCCAGGACCTGCATAAGCCAGCTTTCG 


AGGCAGACATATCTGAGCTCATCCTTTGCCAGAACGAGGTTGACTACGCTCTCAAG^ 


CTGGATGAAGGATCAACCACGGTCCACGAACCTGTTCATGAAGCTGGACTCG^
GAACCCTTTGGCCTGGTCCTCATCATCGCACCCTGGi

tggtgggcaccctccccgcagggaattgcgtggtgctgaagccgtcagaaaox:agccagggcacaga 
gaaggtcctggctgaggtgctgccccagtacctggaccagagctgctttgccgtggtgctgggcgga 
ccccaggagacagggcagctgctagagcacaagttggactacatcttcttcacagggagccctcgtg 
tgggcaagattgtcatgactgctgccaccaagcacctgacgcctgtcaccctggagctggggggcaa 
gaacccctgctacgtggacgacaactgcgacccccagaccgtggccaaccgcgtggcctggttctgc 
tacttcaatgccggccagacctgcgtggcccctgactacgtcctgtgcagccccgagatgcaggaga 
ggctgctgcccgccctgc agagc acc atc acccgtttc tatggcgacgacccc cagagc tccccaaa 
cctggg ccgc atc atc aac c ag aaacagt tc c agcgg ctgcgggcattgctgggc tgcggc cgcgtg 
gccattgggggccagagcaacgagagcgatcgctacatcgcccccacggtgctggtggacgtgcagg 
agacggagcctgtgatgcaggaggagatcttcgggcccatcctgcccatcgtgaacgtgcagagcgt 
ggacgaggccatc aagttcat c aaccggcagg agaagccc ctggccctgc acagtgggatgggccgg 
taccacggcaagttcaccttcgacaccttctcccaccaccgcacctgcctgctcgccccctccggcc 
tggagaaattaaaggag atc cg c taccc accctataccg actgg aacc agc agctgttacgctgggg 
catgggctcccagagctgcaccctcctgtg agcgtcccacccgcctccaacgggtcacacagagaaa 
cctgagtctagccatgaggggcttatgctcccaactcacattgttcctccagaccgcaggctccccc 



AGCCTCAGGTTOCTGGAGCTGTCACATGACTGCATCCTGCCTGCC 



ORF Start: ATG at 407 |ORF Stop: TGA at 1436





SEQ ID NO: 22 |343 aa |MW at 38350.9kD 


NOV3b, 
CG118051-02 
Protein 
Sequence 


MKDEPRSTNLFMKLDSVFIV^EPFGLVLIIAPWNYPLl^ 

VLAEVLPQ YLDQ SC FAWLGG PQETGQLLEHKLDYIFFTGS PRVGKI VMTAATKHLTPVTLELGGKN 
PCYVDDNCDP^TVANRVAWFCYFNAGQTCVAPDYVLCSPEMQERLLPALQSTITRFYGDDPQSSP^ 
GRIINQKQFQRLRALI#GCGRVAIGGQSNESDRYIAPTVI#VDVQETEPWQEEIFGPILPIVWQSVD 
EAIKF INRQEKPIJU^HSGMGRYHGKFTFOTFSHHRTCLLAPSGLEKLKE IRYP PYTDWNQQLLRWGM 
GSQSCTLI* 





SEQ ID NO: 23 |1791 bp


NOV3c, 
CG118051-03 
DNA Sequence 


TTAAGGAGAATCTTAAAGTGAGGGCTGAGGGACTCTCCTGATCCAGAGCTGAGGACTCTCCTGATCCA 


GAGC TG AGGGCTCTCCTGATGGACCCCTTCGAGGAC ACGCTGCGGCGGC TGCGTGAGGCCTTC AACTG 


AGGGCGCACGCGGCCGGCCGAGTTCCGGGCTGCGCAGCTCCAGGGCCTGGGCCACTTCCTTCAAGAAA 


ACAAGCAGCTTCTGCGCGACGTGCTGGCCCAGGACCTGCATAAGCCAGCTTTCGAGGCAGACATATCT 


GAGCTCATCCTTTGCCAGAACGAGGTTGACTACGCTCTCAAGAACCTTCAGGCCTGGATGAAGGATGA 


ACCACGGTCCACGAACCTGTTCATGAAGCTGGACTCGGTCTTCATCTGGAAGGAACCCTTTGGCCTGG 
TCCTCATC ATCGC ACCC TGGAAC TACCC ACTGAACCTGACCC TGGTGCTCC TGGTGGGCGCCCTCGCC 
GCAGGGAATTGCGTGGTGCTGAAGCCGTCAGAAATCAGCCAGGGCACAGAGAAGGTCCTGGCTGAGGT 
GC TGCCCC AGTACC TGGACCAGAGCTGC TTTGCCGTGGTGCTGGGCGGACCCCAGGAGACAGGGCAGC 
TGCTAGAGCACAAGTTGGACTACATCTTCTTCACAGGGAGCCCTCGTGTGGGCAAGATTGTCATGACT 
GCTGCCACCAAGCACCTGACGCCTGTCACCCTGGAGCTGGGGGGCAAGAACCCCTGCTACGTGGACGA 
CAACTGCGACCCCC AGACCGTGGCCAACCGCGTGGCC TGGTTCTGC TACTTCAATGCCGGC CAGACCT 
GCGTGGCCCCTGAC TACGTCCTGTGC AGC CCCGAGATGC AGGAGAGGCTGCTGCCCGCCCTGCAG AGC 
ACCATCACCCGTTTCTATGGCGACGACCCCCAGAGCTCCCCAAACCTGGGCCGCATCATCAACCAGAA 
ACAGTTCCAGCGGCTGCGGGCATTGCTGGGC1OTGGCCGCGTGGCCATTGGGGGCCAG 
GCGATCGCTACATCGCCCCCACGGTGCTGGTGGACGTGCAGGAGACGGAGCCTGTGATGCAGGAGGAG 
ATCTTCGGGCCCATCCTGCCCATCGTGAACGTGCAGAGCGTGGACGAGGCCATCAAGTTCATCAACCG 
GC AGGAGAAGCCCCTGGCC C TGTACGCCTTC TCC AAC AGCAGCCAGGTTGTG AACC AGATGCTGGAGC 

GGGGGAGTCGGCCACAGTGGGATGGGCCGGTACCACGGCAAGTTCACCTTCGACACCTTCTCCCACCA 
CCGCACCTGCCCGCTCGCCCCCTCCGGCCTGGAGAAATTAAAGGAGATCCGCTACCCACCCTATACCG 
ACTGGAAC(^GCAGCTGTTACGCTGGGGCATGGGCTCCCAGAGCTG(^CCCTCCTGTGAGCGTCCCAC 
CCGCCTCCAACGGGTCACACAGAGAAACCTGAGTCTAGCCATGAGGGGCTTATGCTCCCAACTCACAT 


TGTTCCTCCAGACCGCAGGCTCCCCCAGCCTCAGGTTGCTGGAGCTGTCACATGACTGCATCCTGCCT 


GCCAGGGCTGCAAAGCAAGGTCTTGCTTCTATCTGGGGGACGCTGCTCGAGAGAGGCCGAGAGGCCGC 


AGAACATGCCAGGTGTCCTCACTCACCCCACCCTCCCCAATTCCAGCCCTTTGCCCTCTCGGTCAGGG 


TTGACCAGGCCAAGGGCTAGCAT 




ORF Start: ATG at 330 |ORF Stop: TGA at 1485

SEQ ID NO: 24 |385 aa |MW at 42653.5kD


NOV3c,
CG118051-03
Protein
Sequence


MKDEPRS TNLFMKLDSVF I WKEPFGLVL 1 1 APWNYPLNLTLVLLVGALAAGNC WLK PS EI SQGTEKV 
LAEVL PQYLDQ SCFAWLGG PQETGQLL EHKLD YI PFTGS PRVGKIVMTAATKHLTPVTLELGGKNPC 
YVDDNCDPQTVANRVAWFC YFNAGQTCVAPDYVLC S PEMQERLLPAIiQST ITRFYGDDPQ S S PNLGR I 
INQKQFQRLRALLGCGRVAIGGQSNESDRYIAPTVLVDVQETEPVMQEEIFGPILPIVNVQSVDEAIK 
FIimQEKPI^YAFSNSSQVVNQMLERTSSGSFGGNEGFTYISLLSVPFGOTGHSGMGRYHGKFTFOT 
FSHHRTCPLAP SGLEKLKE IR YPPYTDWNQQLLRWGMGSQSCTLL 




Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 3B. 



Table 3B. Comparison of NOV3a against NOV3b and NOV3c. 


Protein Sequence 


NOV3a Residues/ 
Match Residues 


Identities/ 

Similarities for the Matched Region 


NOV3b 


1..385 
1..343 


331/385 (85%) 
331/385 (85%) 


NOV3c 


1..385 
1..385 


363/385 (94%) 
363/385 (94%) 
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Further analysis of the NOV3a protein yielded the following properties shown in 
Table 3C. 
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Table 3C. Protein Sequence Properties NOV3a 


PSort analysis: 


0.7900 probability located in plasma membrane; 0.3000 probability located in 
Golgi body; 0.2000 probability located in endoplasmic reticulum (membrane); 
0.1743 probability located in microbody (peroxisome) 


SignalP analysis: 


Cleavage site between residues 54 and 55 
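
The PSort line above is a semicolon-separated list of scores and locations. A minimal sketch of how such a line could be parsed and ranked is given below; the regular expression and the function name parse_psort are assumptions made for illustration and are not part of PSort itself. Note that the scores are per-location values and do not sum to 1.

```python
import re

# Matches entries such as "0.7900 probability located in plasma membrane"
PSORT_ENTRY = re.compile(r"(\d\.\d+) probability located in ([^;]+)")


def parse_psort(text: str) -> list[tuple[float, str]]:
    """Return (score, location) pairs from a PSort summary string."""
    flat = " ".join(text.split())  # undo line wrapping
    return [(float(score), location.strip())
            for score, location in PSORT_ENTRY.findall(flat)]


nov3a_psort = """0.7900 probability located in plasma membrane; 0.3000 probability located in
Golgi body; 0.2000 probability located in endoplasmic reticulum (membrane);
0.1743 probability located in microbody (peroxisome)"""

predictions = parse_psort(nov3a_psort)
best_score, best_location = max(predictions)
print(best_location, best_score)
```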



A search of the NOV3a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 3D.



Table 3D. Geneseq Results for NOV3a

Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV3a Residues; Match Residues | Identities; Similarities for the Matched Region | Expect Value
AAB58156 | Lung cancer associated polypeptide sequence SEQ ID 494 - Homo sapiens, 430 aa. [WO200055180-A2, 21-SEP-2000] | 1..353; 62..414 | 325/353 (92%); 337/353 (95%) | 0.0
ABB66868 | Drosophila melanogaster polypeptide SEQ ID NO 27396 - Drosophila melanogaster, 561 aa. [WO200171042-A2, 27-SEP-2001] | 14..309; 95..390 | 158/296 (53%); 212/296 (71%) | 3e-94
ABB65492 | Drosophila melanogaster polypeptide SEQ ID NO 23268 - Drosophila melanogaster, 561 aa. [WO200171042-A2, 27-SEP-2001] | 14..309; 95..390 | 158/296 (53%); 212/296 (71%) | 3e-94
ABP39856 | Staphylococcus epidermidis ORF amino acid sequence SEQ ID NO:4701 - Staphylococcus epidermidis, 464 aa. [US6380370-B1, 30-APR-2002] | 2..365; 88..451 | 157/366 (42%); 235/366 (63%) | 1e-85
AAG82730 | S. epidermidis open reading frame protein sequence SEQ ID NO:2554 - Staphylococcus epidermidis, 459 aa. [WO200134809-A2, 17-MAY-2001] | 2..365; 83..446 | 157/366 (42%); 235/366 (63%) | 1e-85



In a BLAST search of public sequence databases, the NOV3a protein was found to
have homology to the proteins shown in the BLASTP data in Table 3E.

Table 3E. Public BLASTP Results for NOV3a




Protein Accession Number | Protein/Organism/Length | NOV3a Residues; Match Residues | Identities; Similarities for the Matched Portion | Expect Value
P48448 | Aldehyde dehydrogenase 8 (EC 1.2.1.5) - Homo sapiens (Human), 385 aa. | 1..385; 1..385 | 385/385 (100%); 385/385 (100%) | 0.0
BAC03897 | CDNA FLJ35145 fis, clone PLACE6009853, highly similar to ALDEHYDE DEHYDROGENASE 8 (EC 1.2.1.5) - Homo sapiens (Human), 385 aa. | 1..385; 1..385 | 381/385 (98%) |
P43353 | Aldehyde dehydrogenase 7 (EC 1.2.1.5) - Homo sapiens (Human), 468 aa. | 1..385; 82..468 | 321/387 (82%); 345/387 (88%) | 0.0
AAH33099 | Similar to aldehyde dehydrogenase 3 family, member B1 - Homo sapiens (Human), 431 aa. | 13..385; 57..431 | 315/375 (84%); 339/375 (90%) | 0.0
Q8VHW0 | Aldehyde dehydrogenase ALDH3B1 (EC 1.2.1.3) - Mus musculus (Mouse), 449 aa (fragment). | 1..385; 63..449 | 295/387 (76%); 336/387 (86%) | e-174
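
Expect values in these tables appear in three textual forms: a plain zero (0.0), a mantissa with exponent (3e-94 in Table 3D), and a bare exponent (e-174 in Table 3E). A minimal sketch for turning them into numbers is shown below. It assumes that a bare exponent such as e-174 stands for 1e-174, which is a common BLAST report shorthand but is an assumption here; the function name parse_evalue is hypothetical.

```python
def parse_evalue(text: str) -> float:
    """Convert an Expect value string from the tables into a float.

    A bare exponent like 'e-174' is read as '1e-174' (assumption).
    """
    cleaned = text.strip().lower()
    if cleaned.startswith("e"):
        cleaned = "1" + cleaned
    return float(cleaned)


# Values as printed in Tables 3D and 3E
evalues = ["0.0", "3e-94", "e-174"]
ranked = sorted(evalues, key=parse_evalue)  # most significant hit first
print(ranked)
```

Sorting on the parsed value rather than on the raw string keeps hits in order of statistical significance regardless of which textual form was printed.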



Pfam analysis predicts that the NOV3a protein contains the domains shown in
Table 3F.

Table 3F. Domain Analysis of NOV3a


Pfam Domain | NOV3a Match Region | Identities; Similarities for the Matched Region | Expect Value
aldedh | 1..351 | 129/492 (26%); 299/492 (61%) | 1.1e-103



Example 4. 

The NOV4 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 4A.



Table 4A. NOV4 Sequence Analysis 




SEQ ID NO: 25 | 1636 bp


NOV4a, 
CG120277-01 
DNA Sequence 


CCAGGAGCCCCAGTTACCGGGAGyVGGCTGTGTCAAAGGCGCCATQAC?r A AC; A TP AnrnACuyorsnrz a a 


GCGCGCCCGCGCCGCCTTCAGCTCGGGCAGGACCCGTCCGCTGCAGTTCCGATTCCAGCAGCTGGAGG 
CGCTGCAGCGCCTGATCCAGGAGCAGGAGCAGGAGCTGGTGGGCGCGCTGGCCGCAGACCTGCACAAG 
AATGAATGGAACGCCTACTATGAGGAGGTGGTGTACGTCCTAGAGGAGATCGAGTACATGATCCAGAA 
GCTCCCTGAGTGGGCCGCGGATGAGCCCGTGGAGAAGACGCCCCAGACTCAGCAGGACGAGCTCTACA 
TCCACTCGGAGCCACTGGGCGTGGTCCTCGTCATTGGCACCTGGAACTACCCCTTCAACCTCACCATC 
CAGCCCATGGTGGGCGCCATCGCTGCAGGGAACGCAGTGGTCCTCAAGCCCTCGGAGCTGAGTGAGAA 
CATGGCGAGCCTGCTGGCTACCATCATCCCCCAGTACCTGGACAAGGATCTGTACCCAGTAATCAATG 








GGGGTGTCCCTGAGACCACGGAGCTGCTCAAG^ 

GGGGTGGGGAAGATCATCATGACGGCTGCTGCCAAGCACCTGACCCCTGTCACGCTGGAGCTGGGAGG 
GAAGAGTCCCTGCTACGTGGACAAGAACTGTGACCTGGACGTGGCCTGCCGACGCATCGCCTGGGGGA 
AATTCATGAACAGTGGCCAGACCTGCGTGGCCCCAGACTACATCCTCTGTGACCCCTCGATCCAGAAC 
CAAATTGTGGAGAAGCTCAAGAAGTCACTGAAAGAGTTCTACGGGGAAGATGCTAAGAAATCCCGGGA 
CTATGGAAGAATCATTAGTGCCCGGCACTTCCAGAGGGTGATGGGCCTGATTGAGGGCCAGAAGGTGG 
CTTATGGGGGCACCGGGGATGCCGCCACTCGCTACATAGCCCCCACCATCCTCACGGACGTGGACCCC 
CAGTCCCCGGTGATGCAAGAGGAGATCTTCGGGCCTGTGCTGCCCATCGTGTGCGTGCGCAGCCTGGA 
GGAGGCCATCCAGTTCATCAACCAGCGTGAGAAGCCCCTGGCCCTCTACATGTTCTCCAGCAACGACA 
AGGTGATTAAGAAGATGATTGCAGAGACATCCAGTGGTGGGGTGGCGGCCAACGATGTCATCGTCCAC 
ATCACCTTGCACTCTCTGCCCTTCGGGGGCGTGGGGAACAGCGGCATGGGATCCTACCATGGCAAGAA 
GAGCTTCGAGACTTTCTCTCACCGCCGCTCTTGCCTGGTGAGGCCTCTGATGAATGATGAAGGCCTGA
AGGTCAGATACCCCCCGAGCCCGGCCAAGATGACCCAGCACTGAGGAGGGGTTGCTCCGCCTGGCCTG 
GCCATACTGTGTCCCATCGGAGTGCGGACCACCCTCACTGGCTCTCCTGGCCCTGGAGAATCGCTCCT 




GCAGCCCCAGCCCAGCCCCACTCCTCTGCTGACCTGCTGACCTGTGCACACCCCACTCCCACATGGGC 


CCAGGCCTCACCATTCCAAGTCTCCACCCCTTTCTAGACCAATAAAGAGACAAATACAATTTTCTAAC 


TCGG 




ORF Start: ATG at 43 | ORF Stop: TGA at 1402
SEQ ID NO: 26 | 453 aa | MW at 50412.5kD


NOV4a, 
CG120277-01 
Protein 
Sequence 


MSKISEAVKRARAAFSSGRTRPLQFRFQQLEALQRLIQEQEQELVGALAADLHKNEWNAYYEEVVYVL
EEIEYMIQKLPEWAADEPVEKTPQTQQDELYIHSEPLGVVLVIGTWNYPFNLTIQPMVGAIAAGNAVV

LKPSELSENMASLLATIIPQYLDKDLYPVINGGVPETTELLKERFDHILYTGSTGVGKIIMTAAAKHL 
TPVTLELGGKSPCYVDKNCDLDVACRRIAWGKFMNSGQTCVAPDYILCDPSIQNQIVEKLKKSLKEFY
GEDAKKSRDYGRIISARHFQRVMGLIEGQKVAYGGTGDAATRYIAPTILTDVDPQSPVMQEEIFGPVL
PIVCVRSLEEAIQFINQREKPLALYMFSSNDKVIKKMIAETSSGGVAANDVIVHITLHSLPFGGVGNS 
GMGSYHGKKSFETFSHRRSCLVRPLMNDEGLKVRYPPSPAKMTQH 





SEQ ID NO: 27 | 1554 bp


NOV4b, 
CG120277-02 
DNA Sequence 


GAGCCCCAGTTACCGGGAGAGGCTGTGTCAAAGGCGCCATGAGCAAGATCAGCGAGGCCGTGAAGCG 


CGCCCGCGCCGCCTTCAGCTCGGGCAGGACCCGTCCGCTGCAGTTCCGGATCCAGCAGCTGGAGGCG 
CTGCAGCGCCTGATCCAGGAGCAGGAGCAGGAGCTGGTGGGCGCGCTGGCCGCAGACCTGCACAAGA 
ATGAATGGAACGCCTACTATGAGGAGGTGGTGTACGTCCTAGAGGAGATCGAGTACATGATCCAGAA
GCTCCCTGAGTGGGCCGCGGATGAGCCCGTGGAGAAGACGCCCCAGACTCAGCAGGACGAGCTCTAC 
ATCCACTCGGAGCCACTGGGCGTGGTCCTCGTCATTGGCACCTGGAACTACCCCTTCAACCTCACCA
TCCAGCCCATGGTGGGCGCCATCGCTGCAGGGAACGCAGTGGTCCTCAAGCCCTCGGAGCTGAGTGA 
GAACATGGCGAGCCTGCTGGCTACCATCATCCCCCAGTACCTGGACAAGGATCTGTACCCAGTAATC
AATGGGGGTGTCCCTGAGACCACGGAGCTGCTCAAGGAGAGGTTCGACCATATCCTGTACACGGGCA 
GCACGGGGGTGGGGAAGATCATCATGACGGCTGCTGCCAAGCACCTGACCCCTGTCACGCTGGAGCT 
GGGAGGGAAGAGTCCCTGCTACGTGGACAAGAACTGTGACCTGGACGTGGCCTGCCGACGCATCGCC 
TGGGGGAAATTCATGAACAGTGGCCAGACCTGCGTGGCCCCAGACTACATCCTCTGTGACCCCTCGA
TCCAGAACCAAATTGTGGAGAAGCTCAAGAAGTCACTGAAAGAGTTCTACGGGGAAGATGCTAAGAA
ATCCCGGGACTATGGAAGAATCATTAGTGCCCGGCACTTCCAGAGGGTGATGGGCCTGATTGAGGGC 
CAGAAGGTGGCTTATGGGGGCACCGGGGATGCCGCCACTCGCTACATAGCCCCCACCATCCTCACGG 
ACGTGGACCCCCAGTCCCCGGTGATGCAAGAGGAGATCTTCGGGCCTGTGCTGCCCATCGTGTGCGT 
GCGCAGCCTGGAGGAGGCCATCCAGTTCATCAACCAGCGTGAGAAGCCCCTGGCCCTCTACATGTTC 
TCCAGCAACGACAAGGTGATTAAGAAGATGATTGCAGAGACATCCAGTGGTGGGGTGGCGGCCAACG 
ATGTCATCGTCCACATCACCTTGCACTCTCTGCCCTTCGGGGGCGTGGGGAACAGCGGCATGGTGAG
GCCTCTGATGAATGATGAAGGCCTGAAGGTCAGATACCCCCCGAGCCCGGCCAAGATGACCCAGCAC 
TGAGGAGGGGTTGCTCCGTCTGGCCTGGCCATACTGTGTCCCATCGGAGTGCGGACCACCCTCACTG 


GCTCTCCTGGCCCTGGGAGAATCGCTCCTGCAGCCCCAGCCCAGCCCCACTCCTCTGCTGACCTGCT 


GACCTGTGCACACCCCACTCCCACATGGGCCCAGGCCTCACCATTCCAAGTCTCCACCCCTTTCTAG 


ACCAATAAAGAGA




ORF Start: ATG at 39 | ORF Stop: TGA at 1341








SEQ ID NO: 28 | 434 aa | MW at 48169.0kD


NOV4b, 
CG120277-02 
Protein 
Sequence 


Mjp"SEAVKRARAAFSSGRTRPLQFRIQQLEAI/2^ 

KHMPVTLEIX^KSPCYTOKNCDLDVACRRIAWGKF^SGQTCVAPDYILCDPSI^QTVEKU^^ 




Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 4B. 



Table 4B. Comparison of NOV4a against NOV4b.

Protein Sequence | NOV4a Residues; Match Residues | Identities; Similarities for the Matched Region
NOV4b | 1..453; 1..434 | 401/453 (88%); 401/453 (88%)
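
Residue ranges in these tables are written as start..end. The sketch below, which assumes the ranges are 1-based and inclusive, computes the length of each aligned span in Table 4B; the function name span_length is hypothetical. The two spans differ in length, which suggests that the alignment contains gap positions on the NOV4b side.

```python
def span_length(span: str) -> int:
    """Length of an inclusive residue range written as 'start..end'."""
    start_text, end_text = span.split("..")
    start, end = int(start_text), int(end_text)
    if end < start:
        raise ValueError("range end precedes range start")
    return end - start + 1


# Ranges taken from Table 4B
nov4a_span = span_length("1..453")
nov4b_span = span_length("1..434")
print(nov4a_span, nov4b_span, nov4a_span - nov4b_span)
```

The difference equals the difference between the reported lengths of the two proteins (453 aa and 434 aa).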



Further analysis of the NOV4a protein yielded the following properties shown in
Table 4C.

Table 4C. Protein Sequence Properties NOV4a


PSort analysis: 


0.7636 probability located in mitochondrial matrix space; 0.4422 probability 
located in mitochondrial inner membrane; 0.4422 probability located in 
mitochondrial intermembrane space; 0.4422 probability located in 
mitochondrial outer membrane 


SignalP analysis: 


No Known Signal Sequence Predicted



A search of the NOV4a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 4D.



Table 4D. Geneseq Results for NOV4a

Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV4a Residues; Match Residues | Identities; Similarities for the Matched Region | Expect Value
AAB58156 | Lung cancer associated polypeptide sequence SEQ ID 494 - Homo sapiens, 430 aa. [WO200055180-A2, 21-SEP-2000] | 48..431; 28..411 | 208/384 (54%); 277/384 (71%) | e-124
ABB66868 | Drosophila melanogaster polypeptide SEQ ID NO 27396 - Drosophila melanogaster, 561 aa. [WO200171042-A2, 27-SEP-2001] | 1..394; 1..394 | 199/394 (50%); 270/394 (68%) | e-115
ABB65492 | Drosophila melanogaster polypeptide SEQ ID NO 23268 - Drosophila melanogaster, 561 aa. [WO200171042-A2, 27-SEP-2001] | 1..394; 1..394 | 199/394 (50%); 270/394 (68%) | e-115
AAG21988 | Arabidopsis thaliana protein fragment SEQ ID NO: 24747 - Arabidopsis thaliana, 484 aa. [EP1033405-A2, 06-SEP-2000] | 2..445; 10..456 | 210/449 (46%); 288/449 (63%) | e-112
AAG11789 | Arabidopsis thaliana protein fragment SEQ ID NO: 10644 - Arabidopsis thaliana, 484 aa. [EP1033405-A2, 06-SEP-2000] | 2..445; 10..456 | 210/449 (46%); 288/449 (63%) | e-112



In a BLAST search of public sequence databases, the NOV4a protein was found to
have homology to the proteins shown in the BLASTP data in Table 4E.

Table 4E. Public BLASTP Results for NOV4a


Protein Accession Number | Protein/Organism/Length | NOV4a Residues; Match Residues | Identities; Similarities for the Matched Portion | Expect Value
P30838 | Aldehyde dehydrogenase, dimeric NADP-preferring (EC 1.2.1.5) (ALDH class 3) (ALDHIII) - Homo sapiens (Human), 453 aa. | 1..453; 1..453 | 453/453 (100%); 453/453 (100%) | 0.0
Q9BT37 | Aldehyde dehydrogenase 3 (Aldehyde dehydrogenase 3 family, member A1) - Homo sapiens (Human), 453 aa. | 1..453; 1..453 | [illegible]; 452/453 (99%) | 0.0
A42584 | aldehyde dehydrogenase (NAD(P)+) (EC 1.2.1.5) 3 - human, 453 aa. | 1..453; 1..453 | 450/453 (99%); 451/453 (99%) | 0.0
A30149 | aldehyde dehydrogenase (NADP+) (EC 1.2.1.4) 3, tumor-associated [similarity] - rat, 453 aa. | 1..453; 1..453 | 370/453 (81%); 415/453 (90%) | 0.0
P11883 | Aldehyde dehydrogenase, dimeric NADP-preferring (EC 1.2.1.5) (ALDH class 3) (Tumor-associated aldehyde dehydrogenase) (HTC-ALDH) - Rattus norvegicus (Rat), 452 aa. | 2..453; 1..452 | 369/452 (81%); 414/452 (90%) | 0.0



Pfam analysis predicts that the NOV4a protein contains the domains shown in
Table 4F.

Table 4F. Domain Analysis of NOV4a


Pfam Domain | NOV4a Match Region | Identities; Similarities for the Matched Region | Expect Value
aldedh | 1..432 | 182/492 (37%); 401/492 (82%) | 7.4e-206



Example 5. 

The NOV5 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 5A.



Table 5A. NOV5 Sequence Analysis 




SEQ ID NO: 29 | 2316 bp


NOV5a, 
CG140468-01 
DNA Sequence 


GCCACGAAGGCCACAGACGCCTTCCCCCTTGGACTCTCATTCCCTTTTCCACGGAGCCCCGCGCTTTC 


GTGAGCCCCCTCGAGGAACCTGGTCTCCGCATCCAGTTACCACCTCCTGCCTCAGAGGCCATCTGAGC 


CCTTCGCACCTCGCCCCTCAGTCCCCCCTTGCCCCCCCGCGGAGATCGCCTCGCTCCCTCCCGCCCCC 


CCATCATCCCTTCCCTCGCAGTTCCCCTGTCCTGAGGGGAGCCCCGCCACGGCAGCGACAGCGGGCAG 


GAGGGAGAAAGTGAAGGTTGGGCGACACTTGGCCTCACTCCCGGCTAGGCGCACCCACGGGGAGGAGA 


GGAGGAGCCGAGAGAGCTGAGCAGCGCGGAAGTAGCTGCTGCTGGTGGTGACAATGTCAAATAACGGC 


CT AG AC ATTCAAGACAAAC CC C C AGC C CC TCCG ATG AG AAAT ACC AGCACTATGATTGGAGTCGG CAG 








CAAAGATGCTGGAACCCTAAACCATGGTTC ' 

AGGACCGATTTTACCGATCCATTTTACCTGGAGATAAAACAAATAAAAAGAAAGAGAAAGAGCGGCCA 

GAGATTTCTCTCCCTTCAGATTTTGAACACACAATTCATGTCGGTTTTGATGCTGTCACAGG 

TACGGGAATGCCAGAGCAGTGGGCCCGCTTGCTTCAGACATCAAATATCACTAAGTCGGAGCAGAAGA 

AAAACCCGCAGGCTGTTCTGGATGTGTTGGAGTTTTACAACTCGAAGAAGACATCCAACAGCCAGA^ 

TACATGAGCTTTACAGATAAGTCAGCTGAGGATTACAATTCTTCTAATGCCTTGAATGTGAAGGCTGT 

GTCTGAGACTCCTGCAGTGCCACCAGTTTCAGAAGATGAGGATGATGATGATGATGATGCTACCCCAC 

CAC CAGTGATTGCTCCACGCCCAGAGCACACAAAATCTGTATACACACGGTCTGTGATTG AACC AC TT 

CCTGTCACTCCAACTCGGGACGTGGCTACATCTCCCATTTCACCTACTGAAAATAACACCACTCCACC 

AGATGCTTTGACCCGGAATACTGAGAAGCAGAAGAAGAAGCCTAAAATGTCTGATGAGGAGATCTTGG 

AGAAATTACGAAGCATAGTGAGTGTGGGCGATCCTAAGAAGAAATATACACGGTTTGAGAAGATTGGA 

CAAGGTGCTTCAGGCACCGTGTACACAGCAATGGATGTGGCCACAGGACAGGAGGTGGCCATTAAGCA 

GATGAATCTTCAGCAGCAGCCCAAGAAAGAGCTGATTATTAATGAGATCCTGGTCATGAGGGAAAACA 
arsaarTT" a a aparpTi^TY^a A^^ar^TY^apa^TT^appovv^nvsp/^af^ * 

m*p T , TYir2r' r TV!r2 2v nnc rpccfffm a r» a^2 a owprs/^iwia papa a a r"T>rnpp a nvifi a mr» a a dnr*n a a a mmr r» * r»o 
1AL1 1 uuL IbuAouL I 1 iTjACA^AXVjXl^loACAVjAAAt.1 XwCAX\?uAX^AA(j{j<_L.AAAXTGUAGC 

Xva -L Vj 1 vj^^u IvjAVj 1\3 X W X uLAooL l\~X OVjr AVj X X\» X luCn lit unALLAuu 1LA1 1 LALAoAuALA X V*. A 

AJoAO X Vj A\_ AA 1 A A 1 C X vj J. i«wAA XoljA X uvjC XV Xvj XuAAvL. X AAL 1\jAL X X X\j<j A X XV i\j X QsC AC AG 

ATAACCCCAGAGCAGAGCAAACGGAGCACCATGGTAGGAACCCCATACTGGATGGCACCAGAGGTTGT 
GACACGAAAGGCCTATGGGCCCAAGGTTGACATCTGGTCCCTGGGCATCATGGCCATCGAAATGATTG 
AAGGGGAGCCTCCATACCTCAATGAAAACCCTCTGAGAGCCTTGTACCTCATTGCCACCAATGGGACC 
CC^GAACTTCAGAACCCAGAGAAGCTGTCAGCTATCTTCCGGGACTTTCTGAACCGCTGTCTCGATAT 
GGATGTGGAGAAGAGAGGTTCAGCTAAAGAGCTGCTACAGCATCAATTCCTGAAGATTGCCAAGCCCC 
TCTCC AGCCTCAC TCCaCTGATTGCTGCAGCTAAGGAGGCAACAAAGAAC AATCACTAAAACCAC AC T 
CACCCCAGCCTCATTGTGCCAAGCTCTGTGAGATAAATGCACATTTCAGAAATTCCAACTCCTGATGC 


CCTCTTCTCCTTGCCTTGCTTCTCCCATTTCCTGATCTAGCACTCCTCAAGACTTTGATCCTTGGAAA 


CCGTGTGTCCAGCATTGAAGAGAACTGCAACTGAATGACTAATCAGATGATGGCCATTTCTAAATAAG 


GAATTTCCTCCCaATTCATGGATATGAG^TGGTTTATGATTaAGGGTTTATATAAATAAATGT 


AGTC 




ORF Start: ATG at 394 | ORF Stop: TAA at 2029
SEQ ID NO: 30 | 545 aa | MW at 60660.3kD


NOV5a, 
CG140468-01 
Protein 
Sequence 


MS1WGLDIQDKPPAPPMROTSTMIGVGSKEAGTL 
EKERPEISLPSDFEHTIHVTCFDAVTGEFTGMPEQWaRL^ 

SNSQKYMS FTDKSAEDYNSSNALNVKAVSETPAVPPVS EDEDDDDDDATPPPVI APRPEHTKSVYTRS 
VIEPLPVTPTRDVATSPISPTENin'TPPDALTRNTEKQKKKPKMSDEEILEKx^SIVSVGD 
FEKIGQGASGTWTAMDVATGQEVaIKQMNLQQQPKKEL I INEILVMRENKNPNI VNYLDSYLVGDEL 
WVVMEYLAGGSLTDVVTETC^EGQIAAVaiECLQAL^ 
GFCAQITPEQSKRSTWGTPYWMAPEVVTRKaYGP^ 

ATNGTPELQNPEKLSAIFRDFLTOCLDMDVEKRGSAKELLQHQFXKIAKPLSSLTPLIAAaKEATKNN 
H 





SEQ ID NO: 31 | 957 bp


NOV5b,
CG140468-02 
DNA Sequence 


GACAATGTCAAATAACGGC CTAGAC ATTCAAGACAAACCC CC AGCCCCTCCGATGAG 
ACTATGATTGGAGCCGGCAGCJ^GATGCTGGAACCCTAAACC^^ 

ACCCAGAGGAGAAGAAAaAGMGGACCGATTTTACCGATCCATTTTACCTGGAGATaaAACaaATaA 

AaAGAAAGAGAAAGAGCGGCCAGAGATTTCTCTCCCTTCAGATTTTGAACACACAATTC^ 

TTTGATGCTGTCACAGGGGaGTTTACGGGAATGCCJ^GAGC^ 

ATATCACTAAGTCGGAGCAGAAGAaAAACCCGCAGGCTGTTCTGGATGTGTTGGAGTTTTACAACTC 
GAAGaAGACATCCAACAGCCAGAAATACJVTGAGCTTTACJ^GATAAGTCAGCTGAGGATTACAATTCT 
TCTAATGCCTTGAATGTGAAGGCTGTGTCTGAGACTCCTGCAGT^^ 
ATGATGATGATGATGATGCTACCCCJVCCJVCCAGTGATTGCT^ 

ATAC^CACGGTCTGTGATTGAACCACTTCCTGTC^CTCCAACTCGGGACGTGGCTACATCTCCCATT 
TC ACC TACTG AAAAT AACACCAC TC CACCJVGATGCTTTG ACCCGG AAT AC TGAGAAG C AG AAG AAG A 
AGCCTAAAATG TC TGATGAGGAGATCT TGGAG AAATT ACG AAG CATAG TG AG TGTGGG CG ATCCTAA 
GAAGAAATATACACGGTTTGAGaAGATTGCCJ^GCCCCTCTCCAGCCTCACTCCACTGATTGCTGCA 
GCTAAGGAGGCAACAAAGAACAATCACTaAAACC^CACTCACCCCAGCCTCATTGTGCCAAGCCTTC 


TGTGAGATAAATGCACATT 




ORF Start: ATG at 5 | ORF Stop: TAA at 899








SEQ ID NO: 32 | 298 aa | MW at 32989.7kD


NOV5b,
CG140468-02 
Protein 
Sequence 


MSNNGLDIQDKPPAPPKRNTSTMIGAGSKDAGTLNHGSKPLPPNPEEKKKKDRFYRSILP 
KEKERPE I SL PSDFEHT I HVGFDAVTGEFTGMPEQWAIUjLQTSNI TKSEQKKNPQAVLDVLEFYNSK 
KTSNSQKYMSFTDKS AED YNS SNALNVKAVS ETPAVPFVSEDEDDDDDDATPPPVIAPRPEHTKSVY 
TRSVI EPL PVTPTRI^ATSP I SPTENNTTPPDALTKNTEKQKKKPKMSDEE ILEKLRS I VSVGDPKK 
KYTRFEKIAKPLSSLTPLIAAAKEATKNNH 




Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 5B. 

Table 5B. Comparison of NOV5a against NOV5b.

Protein Sequence | NOV5a Residues; Match Residues | Identities; Similarities for the Matched Region
NOV5b | 1..281; 1..281 | 238/281 (84%); 239/281 (84%)



Further analysis of the NOV5a protein yielded the following properties shown in
Table 5C.

Table 5C. Protein Sequence Properties NOV5a


PSort analysis: 


0.7000 probability located in nucleus; 0.3000 probability located in 
microbody (peroxisome); 0.1000 probability located in mitochondrial matrix 
space; 0.1000 probability located in lysosome (lumen) 


SignalP analysis: 


No Known Signal Sequence Predicted 



A search of the NOV5a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 5D.



Table 5D. Geneseq Results for NOV5a

Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV5a Residues; Match Residues | Identities; Similarities for the Matched Region | Expect Value
AAB03968 | p-21 activated protein kinase (PAK1) - Homo sapiens, 545 aa. [WO200060062-A2, 12-OCT-2000] | 1..545; 1..545 | 544/545 (99%); 545/545 (99%) | 0.0
AAY55958 | Human STE20-related protein kinase PAK1 Jh - Homo sapiens, 545 aa. [WO9953036-A2, 21-OCT-1999] | 1..545; 1..545 | 541/545 (99%); 542/545 (99%) | 0.0
ABG30251 | Novel human diagnostic protein #30242 - Homo sapiens [WO200175067-A2, 11-OCT-2001] | 1..542; 7..557 | 474/556 (85%); 500/556 (89%) | 0.0
AAW72757 | Human doublin - Homo sapiens, 544 aa. [WO9840495-A1, 17-SEP-1998] | 3..544; 2..542 | 444/552 (80%); 489/552 (88%) | 0.0
ABB57290 | Mouse ischaemic condition related protein sequence SEQ ID NO:817 - Mus musculus, 544 aa. [WO200188188-A2, 22-NOV-2001] | 3..544; 2..542 | 441/552 (79%); 483/552 (86%) | 0.0



In a BLAST search of public sequence databases, the NOV5a protein was found to
have homology to the proteins shown in the BLASTP data in Table 5E.

Table 5E. Public BLASTP Results for NOV5a


Protein Accession Number | Protein/Organism/Length | NOV5a Residues; Match Residues | Identities; Similarities for the Matched Portion | Expect Value
Q13153 | Serine/threonine-protein kinase PAK1 (EC 2.7.1.-) (p21-activated kinase 1) (PAK-1) (P65-PAK) (Alpha-PAK) - Homo sapiens (Human), 545 aa. | 1..545; 1..545 | 545/545 (100%); 545/545 (100%) | 0.0
P35465 | Serine/threonine-protein kinase PAK1 (EC 2.7.1.-) (p21-activated kinase 1) (PAK-1) (P68-PAK) (Alpha-PAK) (Protein kinase MUK2) - Rattus norvegicus (Rat), 544 aa. | 1..545; 1..544 | 537/545 (98%); 539/545 (98%) | [illegible]
S40482 | serine/threonine-specific protein kinase (EC 2.7.1.-) - rat, 544 aa. | 1..545; 1..544 | 534/545 (97%); 537/545 (97%) | 0.0
O88643 | Serine/threonine-protein kinase PAK1 (EC 2.7.1.-) (p21-activated kinase 1) (PAK-1) (P65-PAK) (Alpha-PAK) (CDC42/RAC effector kinase PAK-A) - Mus musculus (Mouse), 545 aa. | 1..545; 1..545 | 530/545 (97%); 537/545 (98%) | 0.0
O75561 | P21 activated kinase 1B - Homo sapiens (Human), 553 aa. | 1..522; 1..522 | 517/522 (99%); 520/522 (99%) | 0.0



Pfam analysis predicts that the NOV5a protein contains the domains shown in
Table 5F.

Table 5F. Domain Analysis of NOV5a


Pfam Domain | NOV5a Match Region | Identities; Similarities for the Matched Region | Expect Value
PBD | 75..135 | 37/64 (58%); 59/64 (92%) | 3.4e-34
pkinase | 270..521 | 94/291 (32%); 208/291 (71%) | 5.7e-90
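
The two Pfam match regions in Table 5F can be placed on the 545-residue NOV5a protein with a small amount of bookkeeping. The sketch below is illustrative: the DomainHit type and the overlaps helper are hypothetical names, and the coordinates are assumed to be 1-based and inclusive.

```python
from typing import NamedTuple


class DomainHit(NamedTuple):
    """A Pfam match region on a protein, with inclusive 1-based coordinates."""
    name: str
    start: int
    end: int

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def overlaps(a: DomainHit, b: DomainHit) -> bool:
    """True if the two regions share at least one residue."""
    return a.start <= b.end and b.start <= a.end


# Match regions from Table 5F; protein length from Table 5A
hits = [DomainHit("PBD", 75, 135), DomainHit("pkinase", 270, 521)]
protein_length = 545

for hit in hits:
    share = 100 * hit.length / protein_length
    print(f"{hit.name}: {hit.length} residues ({share:.1f}% of NOV5a)")
print("regions overlap:", overlaps(hits[0], hits[1]))
```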



Example 6. 

The NOV6 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 6A.



Table 6A. NOV6 Sequence Analysis

SEQ ID NO: 33 | 3255 bp

NOV6a,
CG142182-01
DNA Sequence

GACAGCTTTGGGTGGACCAGTAATGAGGA^ TGAGGC^



CTTCAGCGCTTTGGAAACTTCTTTAGTTGGGACCTCCGGTCATGACCTQATCTATCGTCTGTACCATG 
GAACCATTGTTAACCAGATTGTTTGTAAAGAATGTAAGAACGTTAGCGAGAGGCAGGAAGACTTCTTA 
GATCTAACAGTAGCAGTCAAAAATGTATCCGGTTTGGAAGATGCTCTCTGGAACATGTATGTAGAAGA 
GGAAGTTTTTGATTGTGACAACTTGTACCACTGTGGAACTTGTGACAGGCTGGTTAAAGCAGCAAAGT 
CGGCCAAATTACGTAAGCTGCCTCCTTTTCTTACTGTTTCATTACTAAGATTTAATTTTGATTTTGTG 
AAATGCGAACGCTACAAGGAAACTAGCTGTTATACATTCCCTCTCCGGATTAATCTCAAGCCCTTTTG 
TGAACAGAGTGAATTGGATGACTTAGAATATATATATGACCTCTTCT(^GTTATTATACACAAAGGTG 
GCTGCTACGGAGGCCATTACCATGTATATATTAAAGATGTTGATCATTTGGGAAACTGGCAGTTTCAA 
GAGG AAAAAAGTAAAC CAGATGTG AATCTG AAAG ATCTCCAGAGTGAAGAAGAG ATTGATC ATCCAC T 
GATGATTCTA7^GCAATCTTATTAGAGGAGGAGAATAATCTAATTCCTGTTGATCAGCTGGGCCAGA 
AACTTTTGAAAAAGATAGGAATATCTTGGAACAAGAAGTACAGAAAACAGCATGGACCATTGCGGAAG 
TTCTTACAGCTCCATTCTCAGATATTTCTACTCAGTTCAGATGAAAGTACAGTTCGTCTCTTGAAGAA 
TAGTTC TCTCC AGGCTGAGTCTGATTTCC A71AGGAATGACCAGC AAATTTTC AAGATGC TTCCTCCAG 
AATCCCCAGGTTTAAACAATAGCATCTCCTGTCCCCACTGGTTTGATATAAATGATTCTAAAGTCCAG 
CCAATCAGGGAAAAGGATATTGAACAGCAATTTCAGGGTAAAGAAAGTGCCTACATGTTGTTTTATCG 
GAAATCCCAGTTGCAGAGACCCCCTGAAGCTCGAGCTAATCCAAGATATGGGGTTCCATGTCATTTAC 
TGAATGAAATGGATGCAGCTAACATTGAACTGCAAACCAAAAGGGCAGAATGTGATTCTGCAAACl^AT 
ACTTTTGAATTGCATCTTCACCTGGGCCCTCAGTATCATTTCTTCAATGGGGCTCTGCACCCAGTAGT 
CTCTCAAACAGAAAGCGTGTGGGATTTGACCTTTGATAAAAGAAAAACTTTAGGAGATCTCCGGCAGT 
CAATATTTCAGCTGTTAGAATTTTGGGAAGGAGACATGGTTCTTAGTGTTGCAAAGCTTGTACCAGCA 
GGACTTCACATTTACCAGTCACTTGGCGGGGATGAACTGACACTGTGTGAAACTGAAATTGCTGATGG 



AACCTCTACTTTTAAATGTTCTTCATCTAGACACAAGCAGTGATGGAGAAAAGTGTTGTCAGGTGATA 
GAATCTCC AC ATGTCTTTCC AGCTAATGC AGAAGTGGGCACTGTCC TC ACAGCC TTAGC AATCCCAGC 
AGGTGTCATCTTCATCAACAGTGCTGGATGTCCAGGTGGGGAGGGTTGGACGGCCATCCCCAAGGAAG 
ACATGAGGAAGACGTTCAGGGAGCAAGGGCTCAGA7VATGGAAGCTCAATTTTAATTCAGGATTCTCAT 
G ATGAT AAC AGC TTGTTG ACC AAGGAAGAG AAATGGGTC ACTAGTATGAATGAGATTGACTGGC TCCA 
CGTTAAAAATTTATGCCAGTTAGAATCTGAAGAG2VAGCAAGTTAAAATATCAGCAACTGTTAACACAA 
TGGTGTTTGATATTCGAATTAAAGCCATAAAGGAATTAAAATTAATGAAGGAACTAGCTGACAACAGC 
TGTTTGAG ACC T ATTGATAGAAATGGGAAGCTTC TTTGTCCAGTGC CGGAC AGC TATACTTTGAAGGA 



TCCTGTTTTTTGCAATGGGGAGTGACGTTCAACCTGGGACAGAAATGGAAATCGTAGTAGAAGAAACA 
ATATCTGTGAGAGATTGTTTAAAGTTAATGCTGAAGAAATCTGGCCTACAAGACTCCTTTATAGGAGA 
TGCCTGGCATTTACGAAAAATGGATTGGTGCTATGAAGCTGGAGAGCCTTTATGTGAAGAAGATGCAA 
CACTGAAAGAACTTCTGATATGTTCTGGAGATACTTTGCTTTTAATTGAAGGACAACTTCCTCCTCTG 
GGTTTCCTGAAGGTGCCCATCTGGTGGTACCAGCTTCAGGGTCCCTCAGGACACTGGGAGAGTCATCA 
GGACCAGACCAACTGTACTTCGTCTTGGGGCAGAGTTTGGAGAGCCACTTCCAGCCAAGGTGCTTCTG 
GGAACGAGCCTGCGCAAGTTTCTCTCCTCTACTTGGGAGACATAGAGATCTCAGAAGATGCCACGCTG 



CCTCAGAGCCTGGACGGTGGAGAGGAAGCGCCCAGGCAGGCTTTTACGAACTGACCGGCAGCCACTCA 
GGGAATATAAACTAGGACGGAGAATTGAGATCTGCTTAGAGCCCCTTCAGAAAGGCGAAAACTTGGGC 
CCCCAGGACGTGCTGCTGAGGACACAGGTGCGCATCCCTGGTGAGAGGACCTATGCCCCTGCCCTGGA 
CCTGGTGTGGAACGCGGCCCAGGGTGGGACTGCCGGCTCCCTGAGGCAGAGAGTTGCCGATTTCTATT 
GTCTTCCCGTGGAGAAGATTGAAATTGCCAAATACTTTCCCGAAAAGTTCGAGTGGCTTCCGATATCT 
AGCTGGAACCAACAAATAACCAAGAGGAAAAAAAAAAAAAAACAAGATTATTTGCAAGGGGCACCGTA 
TTACTTGAAAGACGGAGATACTATTGGTGTTAAGGTAAGTTGTTTAACAGC^U^TTTACC^CTTTGAG 
AAGACACGAGGGTCACATGATTTTATAGAGACGTTTTATTGAATCTTCAAGACACAGAT 



ORF Start: ATG at 31 | ORF Stop: TGA at 3193
SEQ ID NO: 34 | 1054 aa | MW at 119613.5kD


NOV6a, 
CG142182-01 
Protein 
Sequence 


mrqhdvqelnrilfsaletslvgtsghdliyrlyhgtiwqivckecknvserqedfldltvavknvs 

gledalwnmyveeevfix:dnlyhcgtcdrlvkaaksakijiklppflwsllrfnfdfvkc 

ytfplrinlkpfceqselddleyiydlfsviihkggcygghyhvyikdvdhlgnwqfqeekskpdvnl 

kdlqseeeidhpi^ilkailleeel^ipvdqlgqkllkkigiswnkkyrkqhgplrkflqlhsqifl 

lssdestvrllknsslqaesdfqrndqqifkmlppespglnnsiscphwfdindskvqpirekdieqq 

fqgkesaymlfyrksqlqrppearanprygvpchllnemdaanielqtkraecdsanntfelhlhlgp 

q yhfft^galhpwsqtesvwdltfdkrktlgdlrqsi fqllefwegdmvlsvaklvpaglh i yqslgg 

deltlceteiadgedifvwngvevggvhiqtgidceplllnvlhldtssdgekccqviesphvfpana 

evgtvltalai pagvi f ins agc pggegwtai pkedmrktfreqglrngss il iqdshddnslltkee 

kwvtsmneidwlhvknlcqleseekqvkisatvn^ 

llcfvpdsytlkeaelkmgsslglclgkapsssqlflffamgsdvqpgtemeiweetisvrdclklm 
lkksglqdsfigdawhlrkmotcyeageplceedatlkellicsgdtllliegqlppix3flkvpiwwy 

QLQGPSGHWESHQDQTNCTS SWGRVWRATS SQGASGNEPAQVSLL YLGDI E I SEDATLAELKSQAMTL 
PPFLEFGVPSPAHLRAWTVERKRPGRLLRTDRQPLREYKLGRRIEICLEPLQKGENLGPQDVLLRTQV 
R I PGERTYAPALDLVWNAAQGGTAGSLRQRVADFYCLPVEKIEI AKYFPEKFEWLP I SSWNQQ ITKRK 
KKKKQDYLQGAP YYLKDGDT IGVKVSCLTANLPL 

Further analysis of the NOV6a protein yielded the following properties shown in
Table 6B.

Table 6B. Protein Sequence Properties NOV6a


PSort analysis: 


0.7000 probability located in plasma membrane; 0.3500 probability located in 
nucleus; 0.3000 probability located in microbody (peroxisome); 0.2000 
probability located in endoplasmic reticulum (membrane) 


SignalP analysis: 


No Known Signal Sequence Predicted 



A search of the NOV6a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 6C.



Table 6C. Geneseq Results for NOV6a 


Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV6a Residues; Match Residues | Identities; Similarities for the Matched Region | Expect Value
AAE14346 | Human protease PRTS-11 protein - Homo sapiens, 1108 aa. [WO200183775-A2, 08-NOV-2001] | 1..1044; 1..1040 | 1037/1044 (99%); 1037/1044 (99%) | 0.0
AAU68535 | Human novel cytokine encoded by cDNA 790CIP2C_6#1 - Homo sapiens, 1346 aa. [WO200175093-A1, 11-OCT-2001] | 1..1044; 129..1167 | 1037/1044 (99%); 1038/1044 (99%) | 0.0
AAB93169 | Human protein sequence SEQ ID NO: 12102 - Homo sapiens, 1014 aa. [EP1074617-A2, 07-FEB-2001] | 1..1019; 1..1014 | 1013/1019 (99%); 1013/1019 (99%) | 0.0
AAU68534 | Human novel cytokine encoded by cDNA 790CIP2C_5#1 - Homo sapiens, 1324 aa. [WO200175093-A1, 11-OCT-2001] | 1..1044; 129..1145 | 1015/1044 (97%); 1015/1044 (97%) | 0.0
ABG27066 | Novel human diagnostic protein #27057 - Homo sapiens, 674 aa. [WO200175067-A2, 11-OCT-2001] | 500..666; 47..214 | 166/168 (98%) |



In a BLAST search of public sequence databases, the NOV6a protein was found to
have homology to the proteins shown in the BLASTP data in Table 6D.


Table 6D. Public BLASTP Results for NOV6a 


Protein Accession Number | Protein/Organism/Length | NOV6a Residues; Match Residues | Identities; Similarities for the Matched Portion | Expect Value
Q9NVE5 | CDNA FLJ10785 fis, clone NT2RP4000457, weakly similar to ubiquitin carboxyl-terminal hydrolase 15 (EC 3.1.2.15) - Homo sapiens (Human), 1014 aa (fragment). | 1..1019; 1..1014 | 1013/1019 (99%); 1013/1019 (99%) | 0.0
Q95KB6 | Hypothetical 102.2 kDa protein - Macaca fascicularis (Crab eating macaque) (Cynomolgus monkey), 907 aa (fragment). | 143..1024; 30..907 | 844/882 (95%); 860/882 (96%) | 0.0
Q8S1J6 | Putative ubiquitin carboxyl-terminal hydrolase - Oryza sativa (japonica cultivar-group), 1079 aa. | 3..342; 223..568 | 102/359 (28%); 165/359 (45%) | 3e-23
Q8VZM4 | Putative ubiquitin carboxyl-terminal hydrolase - Arabidopsis thaliana (Mouse-ear cress), 683 aa. | 3..202; 278..480 | 72/205 (35%); 105/205 (51%) | 3e-23
Q94ED6 | Putative ubiquitin carboxyl-terminal hydrolase - Oryza sativa (Rice), 1108 aa. | 3..342; 273..618 | 102/359 (28%); 165/359 (45%) | 3e-23



Pfam analysis predicts that the NOV6a protein contains the domains shown in
Table 6E.

Table 6E. Domain Analysis of NOV6a


Pfam Domain | NOV6a Match Region | Identities; Similarities for the Matched Region | Expect Value
UCH-2 | 157..354 | 23/203 (11%); 141/203 (69%) | 0.00033



Example 7.

The NOV7 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 7A. 



Table 7A. NOV7 Sequence Analysis

SEQ ID NO: 35 | 692 bp


NOV7a, 
CG142564-01 
DNA Sequence 


GACAGGAGTGAACCCGAGCTGTGCCGACCAACCCCCAGGATGGCGGAAGCTCACCAGGCCGTGGCCTT 


CCAGTTCACGGTGACCCCAGACGGGGTCGACTTCCGGCTCAGTCGGGAGGCCCTGAAACACGTCTACC 
TGTCTGGGATCAACTCCTGGAAGAAACGCCTGATCCGCATCAAGAATGGCATCCTCAGGGGCGTGTAC 
CCTGGCAGCCCCACCAGCTGGCTGGTCGTCATCATGGTAACAGTGGGTTCCTCCTTCTGCAACGTGGA
CATCTCCTTGGGGCTGGTCAGTTGCATCCAGAGATGCCTCCCTCAGGGGTGTGGCCCCTACCAGACCC 
CGCAGACCCGGGCACTTCTCAGCATGGCCATCTTCTCCACGGGCGTCTGGGTGACGGGCATCTTCTTC 
TTCCGCCAAACCCTGAAGCTGCTTCTCTGCTACCAATCCCAGATCCGCATGTTCGACCCAGAGCAGCA
CCCCAATCACCTGGGCGCTGGAGGTGGCTTTGGCCCTGTAGCAGATGATGGCTATGGAGTTTCCTACA 
TGATTGCAGGCGAGAACACGATCTTCTTCCACATCTCCAGCAAGTTCTCAAGCTCAGAGACGAACGCC
CAGCGCTTTGGAAACCACATCCGCAAAGCCCTGCTGGACATTGCTGATCTTTTCCAAGTTCCTCAGGC 
CTACAGCTGAAG




ORF Start: ATG at 40 | ORF Stop: TGA at 688
SEQ ID NO: 36 | 216 aa | MW at 23874.3kD


NOV7a, 
CG142564-01 
Protein 
Sequence 


MAEAHQAVAFQFTVTPDGVDFRLSREALKHVYLSGINSWKKRLIRIKNGILRGVYPGSPTSWLVVIMV
TVGSSFCNVDISLGLVSCIQRCLPQGCGPYQTPQTRALLSMAIFSTGVWVTGIFFFRQTLKLLLCYQS
QIRMFDPEQHPNHLGAGGGFGPVADDGYGVSYMIAGENTIFFHISSKFSSSETNAQRFGNHIRKALLD
IADLFQVPQAYS
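
The molecular weight reported for NOV7a can be sanity-checked against its length. The sketch below uses the common rule of thumb of roughly 110 Da per amino acid residue, which is an assumption introduced here and not a method stated in this document; the constant and function names are hypothetical. The reported figure of 23874.3 has the magnitude expected in daltons for a 216-residue polypeptide, even though the table labels it kD.

```python
# Rule-of-thumb average mass of one amino acid residue, in daltons (assumption)
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_da(residue_count: int) -> float:
    """Rough molecular mass estimate from the residue count alone."""
    if residue_count <= 0:
        raise ValueError("residue count must be positive")
    return residue_count * AVERAGE_RESIDUE_MASS_DA


# NOV7a: 216 aa, reported MW 23874.3
reported_mass = 23874.3
estimated_mass = estimate_mass_da(216)
relative_difference = abs(estimated_mass - reported_mass) / reported_mass
print(f"{estimated_mass:.0f} Da estimated, {relative_difference:.1%} from the reported value")
```

An exact figure would require summing the masses of the individual residues in the sequence; the estimate is only meant to confirm the order of magnitude.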



Further analysis of the NOV7a protein yielded the following properties shown in
Table 7B.

Table 7B. Protein Sequence Properties NOV7a

PSort analysis:

0.7900 probability located in plasma membrane; 0.6400 probability located in
microbody (peroxisome); 0.3000 probability located in Golgi body; 0.2000
probability located in endoplasmic reticulum (membrane)

SignalP analysis:

Cleavage site between residues 5 and 6



A search of the NOV7a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 7C.



Table 7C. Geneseq Results for NOV7a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV7a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


AAW14438 ! 


Type I carnitine palmitoyl 
transferase-like protein - 
Homo sapiens, 772 aa. 
[JP09009969-A, 

1/1 T A KT 1 0fV71 

14-JAiN-iyy /j 


1..134 
1..134 


131/134 (97%) 
131/134(97%) 


4e-72 


AAE10322 


Human carnitine 
acyltransferase, 26886 - 
Homo sapiens, 803 aa. 
[WO200166759-A2, 
13-SEP-2001] 


1..134 
1..132 


57/134(42%) 
78/134(57%) 


le-21 


AAY79220 


Human transferase 
TRNSFS-12 - Homo sapiens, 
803 aa. [WO200014251-A2, 
16-MAR-2000] 


1..134 
1..132


57/134(42%) 
78/134 (57%) 


1e-21


ABB67527 


Drosophila melanogaster 
polypeptide SEQ ID NO 
29373 - Drosophila 
melanogaster, 780 aa. 
[WO200171042-A2, 
27-SEP-2001] 


137..210 
688..761 


43/74 (58%) 
55/74(74%) 


6e-19 


ABB66942 


Drosophila melanogaster 
polypeptide SEQ ID NO 
27618 - Drosophila 
melanogaster, 782 aa. 
[WO200171042-A2, 
27-SEP-2001] 


137..210 
690..763 


43/74(58%) 
55/74(74%) 


6e-19 
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In a BLAST search of public sequence databases, the NOV7a protein was found to
have homology to the proteins shown in the BLASTP data in Table 7D. 



Table 7D. Public BLASTP Results for NOV7a 


Protein

Accession 

Number 


Protein/Organism/Length 


NOV7a 
Residues/ 
Match 
Residues 


Identities/
Similarities for 
the Matched 
Portion 


Expect 
Value 


Q9BY90 


KIAA1670 protein - Homo
sapiens (Human), 598 aa
(fragment).


1..134 
18..151 


133/134(99%) 
133/134(99%) 


2e-73 


Q92523 


Carnitine
O-palmitoyltransferase 1,
mitochondrial muscle isoform

(EC 2.3.1.21) (CPTI) 
(CPTI-M) (Carnitine 
palmitoyltransferase I like 
protein) - Homo sapiens 
(Human), 772 aa. 


1..134 
1..134 


133/134(99%) 
133/134(99%) 


2e-73 


Q924X2 


Muscle-type carnitine 
palmitoyltransferase I (EC 
2.3.1.21) (Hypothetical 88.2 
kDa protein) - Mus musculus 
(Mouse), 772 aa. 


1..149 
1..147 


118/149(79%) 
128/149 (85%) 


1e-63


O35287


Carnitine palmitoyltransferase 
I - Mus musculus (Mouse), 
772 aa. 


1..149 
1..147 


118/149(79%) 
128/149 (85%) 


1e-63


Q9QYP4 


Muscle type carnitine 
palmitoyltransferase I - Mus 
musculus (Mouse), 772 aa. 


1..149 
1..147 


118/149(79%) 
128/149(85%) 


1e-63
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Example 8. 

The NOV8 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 8A.



Table 8A. NOV8 Sequence Analysis 




SEQ ID NO: 37 |1122 bp


NOV8a, 
CG142797-01 
DNA Sequence 


CTAGATTTTTGAAACATGAATCCTTCACTCCTCCTGGCTGCCTTTTTCCTGGGAATTGCCTCAGCTGC
TCTAACACGTGACCACAGTCTAGACGCACAATGGACCAAGTGGAAGGCAAAGCACAAGAGATTATA
ACATGGAGAACATGAAGATGACTGAGCAGCACAATCAGGAATACAGCCAAGGGAAACACAGCTTCACA
ATGGCCATGAACACCTTTGGAGACATGACCACTGAAGAATTCAGGCAGGTGATGAATGGTTTTCAATA 
CCAGAAGCACAGGAACGGGAAACAGTTCCAGGAACGCCTGCTTCTTGAGATC 








(aljAtjAoAt^AA/vjVj^ 1 vy/iL. iv.v» 1 vs lAjiuiuonl wwvsurrwivj a o xvsvjv. i\# x a cr l ivjwjv, 111 l/iw i 

GCAACTGGTGCTCTGGAAGGGCAGATGTTCTGGAAAACA 

TCTGGTAGACTGCTCTGGGCCTCAAGGCAATGAGGGCTGCAATGGTGGCTTCATGGATAATCCCTTCC

GGTATCTTCAGGAGAACGGAGGCCTGGACTCTGAGG^ 

AATCCCAAGTATTCTGCTGCTAATGACACTGGCTTTC 




ATAAAAAAGGTATTTATTTTGAGCCACGCTGTGACCC^^ 

GGCTACAGCTATGAAGGAGCAGACTCAGATAACAATAAATATTGGCTGGTGAAGAACAGGTATGGTAA 
AAACTGGGGCATGGATGGCTACATAAAGATGGCCAAAGACCGGAGGAACAACTGTGGAATTGCCACAG 
PAf;rCAf:nTACCCCACTGTGTGAGCTGATGGATGGTC 


CCAAAGGAGGAATTTATCTTCAATCTACCAGCCCCTGCTGTGTGGAATGCGCACTTCAATCATTGAAG 


ATCCAAGTGTGATTGGAATTCTGATATTTTCACA 




ORF Start: ATG at 16 | ORF Stop: TGA at 973





SEQ ID NO: 38 |319 aa |MW at 35984.2kD


NOV8a,
CG142797-01 
Protein 
Sequence 


MNPSLLLAAFFLGIASAALTl^HSLDAQWT^^ 

FGDMTTEEFRQVl^GPQYQKHRNGKQFQElUiLLEIPTSVDWRKKGYMTPVKDQGQCGSCWAFS 
EGQMFWKTGKLISLNEQNLVDCSGPQGN^^ 
AAITOTGFVDIPSQEKDLAKAVAWGPISVAAGASHVSFQFY 
GADSDNNKYWLVKNRYGKNWGMDGYIKMAKDRRNNCGIATAASYPTV
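
The ORF coordinates and the protein length reported in Table 8A can be cross-checked arithmetically. The following is a minimal sketch, assuming that "ORF Start" gives the 1-based position of the first base of the ATG and "ORF Stop" gives the 1-based position of the first base of the stop codon; the function name is hypothetical and is not part of the disclosure.

```python
def orf_protein_length(start: int, stop: int) -> int:
    """Number of residues encoded between an ATG beginning at `start`
    and a stop codon beginning at `stop` (1-based nucleotide positions,
    an assumed reading of the table's coordinates)."""
    coding_nt = stop - start  # bases from the ATG up to, not including, the stop codon
    if coding_nt <= 0 or coding_nt % 3 != 0:
        raise ValueError("start and stop are not in the same reading frame")
    return coding_nt // 3


# NOV8a: ORF Start ATG at 16, ORF Stop TGA at 973
print(orf_protein_length(16, 973))  # 319, matching the 319 aa listed for SEQ ID NO: 38
```

Under the same assumption, the coordinates given for NOV9a (76 and 1687) yield 537 residues, in agreement with SEQ ID NO: 40.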



Further analysis of the NOV8a protein yielded the following properties shown in 
Table 8B. 
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Table 8B. Protein Sequence Properties NOV8a 


PSort analysis: 


0.8200 probability located in endoplasmic reticulum (membrane); 0.5140 
probability located in plasma membrane; 0.2423 probability located in 
microbody (peroxisome); 0.1000 probability located in endoplasmic reticulum 
(lumen) 


SignalP analysis: 


Cleavage site between residues 18 and 19 



A search of the NOV8a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 8C.



Table 8C. Geneseq Results for NOV8a


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date]


NOV8a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 






AAU98883 


Human protease PRTS1 - 
Homo sapiens, 334 aa. 
[WO200238744-A2,
16-MAY-2002] 



1..319 
1..334 


310/334 (92%) 


e-180 


ABG61771 


Novel cathepsin-L 
precursor-like protein - Homo 
sapiens, 333 aa. 
[WO200229058-A2, 
11-APR-2002]


1..319 
1..333 


288/333 (86%) 
300/333 (89%) 


e-171 


ABG66692 


Human novel polypeptide #27 
- Homo sapiens, 333 aa.
[WO200244340-A2, 
06-JUN-2002] 


1..319 
1..333 


260/333 (78%) 
278/333 (83%) 


e-154 


ABG66714 


Human novel polypeptide #49
- Homo sapiens, 333 aa.

[WO200244340-A2, 
06-JUN-2002] 


1..319
1..333 


259/333 (77%) 
277/333 (82%) 


e-154 


ABB77396 


Human cathepsin L - Homo 
sapiens, 333 aa. 
[DE10050274-A1, 
18-APR-2002] 


1..319 
1..333 


249/333 (74%)
274/333 (81%) 


e-147 


In a BLAST search of public sequence databases, the NOV8a protein was found to
have homology to the proteins shown in the BLASTP data in Table 8D. 


Table 8D. Public BLASTP Results for NOV8a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV8a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


P07711 


Cathepsin L precursor (EC 
3.4.22.15) (Major excreted 
protein) (MEP) - Homo 
sapiens (Human), 333 aa. 


1..319 
1..333 


249/333 (74%) 
274/333 (81%) 


e-147 


Q9GKL8 


Cysteine protease - 
Cercopithecus aethiops (Green 
monkey) (Grivet), 333 aa. 


1..319 
1..333 


247/333 (74%) 
273/333 (81%) 


e-146 


Q9GL24 


Cathepsin L (EC 3.4.22.15) - 
Canis familiaris (Dog), 333 aa. 


1..319 
1..333 


236/334(70%) 
265/334 (78%) 


e-138 


Q28944 


Cathepsin L precursor (EC
3.4.22.15) - Sus scrofa (Pig), 
334 aa. 


1..319 
1..334 


228/334 (68%) 
263/334(78%) 


e-135 






P25975


Cathepsin L precursor (EC
3.4.22.15) - Bos taurus
(Bovine), 334 aa.


1..319
1..334


222/334 (66%)
261/334 (77%)


e-133









PFam analysis predicts that the NOV8a protein contains the domains shown in the 
Table 8E. 
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Table 8E. Domain Analysis of NOV8a 


Pfam Domain


NOV8a Match Region 


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


Peptidase_C1


103..318 


123/337 (36%) 
194/337 (58%) 


2.4e-111



Example 9. 

The NOV9 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 9A.



Table 9A. NOV9 Sequence Analysis




SEQ ID NO: 39 |1740 bp


NOV9a, 
CG143216-01 
DNA Sequence 


CACGAGGCCGCTAACGGTCCGGCGCCCCTCGGCGTCCGCGCGCCCCCAGCCTGGCGGACGAGCCCGGC 


GGCGGAGATGGGGGCGACGGGGGCGGCGGAGCCGCTGCAATCCGTGCTGTGGGTGAAGCAGCAGCGCT 
GCGCCGTGAGCCTGGAGCCCGCGCGGGCTCTGCTGCGCTGGTGGCGGAGCCCGGGGCCCGGAGCCGGC 
GCCCCCGGTGCTGATGCCTGCTCTGTGCCTGTATCTGAGATCATCGCCGTTGAGGAAACAGACGTTCA
CGGGAAACATCAAGGCAGTGGAAAATGGCAGAAAATGGAAAAGCCTTACGCTTTTACAGTTCACTGTG 
TAAAGAGAGCACGACGGCACCGCTGGAAGTGGGCGCAGGTGACTTTCTGGTGTCCAGAGGAGCAGCTG 
TGTCACTTGTGGCTGCAGACCCTGCGGGAGATGCTGGAGAAGCTGACGTCCAGACCAAAGCATTTACT
GGTATTTATCAACCCGTTTGGAGGAAAAGGACAAGGCAAGCGGATATATGAAAGAAAAGTGGCACCAC 
TGTTCACCTTAGCCTCCATCACCACTGACATCATCGTTACTGAACATGCTAATCAGGCCAAGGAGACT 
CTGTATGAGATTAACATAGACAAATACGACGGCATCGTCTGTGTCGGCGGAGATGGTATGTTCAGCGA 
GGTGCTGCACGGTCTGATTGGGAGGACGCAGAGGAGCGCCGGGGTCGACCAGAACCACCCCCGGGCTG 
TGCTGGTCCCCAGTAGCCTCCGGATTGGAATCATTCCCGCAGGGTCAACGGACTGCGTGTGTTACTCC 
ACCGTGGGCACCAGCGACGCAGAAACCTCGGCGCTGCATATCGTTGTTGGGGACTCGCTGGCCATGGA 
TGTGTCCTCAGTCCACCACAACAGCACACTCCTTCGCTACTCCGTGTCCCTGCTGGGCTACGGCTTCT 
ACGGGGACATCATCAAGGACAGTGAGAAGAAACGGTGGTTGGGTCTTGCCAGATACGACTTTTCAGGT 
TTAAAGACCTTCCTCTCCCACCACTGCTATGAAGGGACAGTGTCCTTCCTCCCTGCACAACACACGGT 
GGGATCTCCAAGGGATAGGAAGCCCTGCCGGGCAGGATGCTTTGTTTGCAGGCAAAGCAAGCAGCAGC 
TGGAGGAGGAGCAGAAGAAAGCACTGTATGGTTTGGAAGCTGCGGAGGACGTGGAGGAGTGGCAAGTC 




CCTCTCCCCGGCTGCCCACTTGGGAGACGGGTCTTCTGACCTCATCCTCATCCGGAAATGCTCCAGGT 
TCAATTTTC TGAG ATTTCTC AT(^GGCACACCAACCAGCAGGACCAGTTTGAC TTCACTTTTGTTGAA 
GTTTATCGCGTCAAGAAATTCCAGTTTACGTCGAAGCACATGGAG^ 

GGGGGGGAAGAAGCGCTTTGGGCACATTTGCAGCAGCCACCCCTCCTGCTGCTGCACCGTCTCCAACA 
GCTCCTGGAACTGCGACGGGGAGGTCCTGCACAGCCCTGCCATCGAGGTCAGAGTCCACTGCCAGCTG 
r2 TT r ' r2 ft '^T^'^TrtP JVPG ARR A aTTRA Aft AG AATCCGA AGC!C!AG ACTCACACAGCT6AG AAGCCGGCGT 
CCTGCTCTCGAACTGGGAAAGTGTGAAAACTATTTAAGAT




ORF Start: ATG at 76 | ORF Stop: TGA at 1687








SEQ ID NO: 40 |537 aa |MW at 59976.9kD


NOV9a, 
CG143216-01 
Protein 
Sequence 


MGATGAAEPLQSVLWVKQQRCAVSLEPARALLRWWRSPGPGAGAPGADACSVPVSEIIAVEETDVHGK 
HQGSGKWQKMEKPYAFTVHCVKRARRHRWKWAQVTFWCPEEQLCHLWLQTLREMLEKLTSRPKHLLVF
INPFGGKGQGKRIYERKVAPLFTLASITTDIIVTEHANQAKETLYEINIDKYDGIVCVGGDGMFSEVL
HGLIGRTQRSAGVDQNHPRAVLVPSSLRIGIIPAGSTDCVCYSTVGTSDAETSALHIVVGDSLAMDVS
SVHHNSTLLRYSVSLLGYGFYGDIIKDSEKKRWLGLARYDFSGLKTFLSHHCYEGTVSFLPAQHTVGS
PRDRKPCRAGCFVCRQSKQQLEEEQKKALYGLEAAEDVEEWQVVCGKFLAINATNMSCACRRSPRGLS
PAAHLGDGSSDLILIRKCSRFNFLRFLIRHTNQQDQFDFTFVEVYRVKKFQFTSKHMEDEDSDLKEGG
KKRFGHICSSHPSCCCTVSNSSWNCDGEVLHSPAIEVRVHCQLVRLFARGIEENPKPDSHS




Further analysis of the NOV9a protein yielded the following properties shown in 
Table 9B. 



Table 9B. Protein Sequence Properties NOV9a 


PSort analysis: 


0.5121 probability located in microbody (peroxisome); 0.3000 probability 
located in nucleus; 0.1000 probability located in mitochondrial matrix space; 
0.1000 probability located in lysosome (lumen) 


SignalP analysis: 


No Known Signal Sequence Predicted 



A search of the NOV9a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 9C.



Table 9C. Geneseq Results for NOV9a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date]


NOV9a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Region 


Expect 
Value 


ABB07857 


Human sphingosine 
kinase-like protein - Homo 
sapiens, 562 aa. 
[WO200228906-A2, 
11-APR-2002]


1..537 
26..562 


537/537(100%) 
537/537(100%) 


0.0 


ABB07856 


Human sphingosine 
kinase-like protein - Homo 
sapiens, 537 aa. 
[WO200228906-A2, 
11-APR-2002]


1..537 
1..537 


537/537(100%) 
537/537(100%) 


0.0 






AAM49115 


Human ceramide kinase 
hCTRTC 1 - Homo sapiens,
537 aa. [WO200196575-A1, 
20-DEC-2001] 


1..537
1..537 


536/537 (99%) 


, m Jt JL. 4.^" * 

0.0 


AAY96059 


Human sphingosine kinase C 
- Homo sapiens, 460 aa. 
[WO200052173-A2,
08-SEP-2000] 


78..537 
1..460 


458/460 (99%) 
459/460 (99%) 


0.0 


AAE07884 


Human sphingosine kinase 
(SphK) protein #2 - Homo 
sapiens, 471 aa. 
[WO200160990-A2, 
23-AUG-2001] 


78..537 
1..471 


459/471 (97%) 
460/471 (97%) 


0.0 


In a BLAST search of public sequence databases, the NOV9a protein was found to
have homology to the proteins shown in the BLASTP data in Table 9D. 


Table 9D. Public BLASTP Results for NOV9a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV9a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Portion 


Expect 
Value 


Q8TCT0 


Putative lipid kinase - Homo 
sapiens (Human), 537 aa. 


1..537 
1..537 


537/537 (100%) 
537/537(100%) 


0.0 


Q9BYB3 


KIAA1646 protein - Homo 
sapiens (Human), 481 aa 
(fragment). 


57..537
1..481 


481/481 (100%) 
481/481 (100%) 


0.0 


BAC01155 


Ceramide kinases - Mus 
musculus (Mouse), 531 aa. 


1..529 
1..529 


450/529 (85%) 
483/529 (91%) 


0.0 


Q9UGE5 


DA59H18.2 (Novel protein 
similar to human, mouse, 
yeast, worm and plant 
(Predicted) proteins) - Homo 
sapiens (Human), 326 aa 
(fragment). 


130..444 
1..326 


314/326(96%) 
315/326 (96%) 


0.0 


Q9TZI1 


T10B11.2 protein - 
Caenorhabditis elegans, 549 
aa. 


79..52S 
115..526 


141/458 (30%) 
230/458 (49%) 


1e-52



PFam analysis predicts that the NOV9a protein contains the domains shown in the 
Table 9E. 






Table 9E. Domain Analysis of NOV9a 


Pfam Domain 


NOV9a Match Region 


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


PH 


32..124


9/93 (10%) 
64/93 (69%) 


0.38 


DAGKc 


132..278 


32/165 (19%) 
100/165 (61%) 


0.00015 



Example 10.

The NOV10 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 10A. 



Table 10A. NOV10 Sequence Analysis 




SEQ ID NO: 41 |772 bp




NOV10a,
CG143787-01 
DNA Sequence 


AACTGGAGACCACAACTTCATGCTGCGTGGGATCTCCCAACTACCTGCAGTGGCCACCATGTCTTGG 
GTCCTGCTGCCTGTACTTTGGCTCATTGTTCAAACTCAAGCAATAGCCATAAAGCAAACACCTGAAT
TAACGCTCCATGAAATAGTTTGTCCTAAAAAACTTCACATTTTACACAAAAGAGAGATCAAGAACAA
CCAGACAGAAAAGCATGGCAAAGAGGAAAGGTATGAACCTGAAGTTCAATATCAGATGATCTTAAAT
GGAGAAGAAATCATTCTCTCCCTACAAAAAACCAAGCACCTCCTGGGGCCAGACTACACTGAAACAT
TGTACTCACCCAGAGGAGAGGAAATTACCACGAAACCTGAGAACATGGAACACTGTTACTATAAAGG
AAACATCCTAAATGAAAAGAATTCTGTTGCCAGCATCAGTACTTGTGACGGGTTGAGAGGATACTTC
ACACATCATCACCAAAGATACCTTTTATCTCAGAAACCAAAGTGCCTGCTGCAAGCACCTATTCCTA
CAAATATAATGACAACACCAGTGTGTGGGAACCACCTTCTAGAAGTGGGAGAAGACTGTGATTGTGG
CTCTCTTAAGGAGTGTACCAATCTCTGCTGTGAAGCCCTAACGTGTAAACTGAAGCCTGGAACTGAT
TGCGGAGGAGATGCTCCAAACCATACCACAGAGTGAATCCAAAAGTCTGCTTCACTGAGATGCTACC


TTGCCAGGACAAGAACCAAGAACTCTAACTGTCCC




ORF Start: ATG at 20 |ORF Stop: TGA at 704 








SEQ ID NO: 42 |228 aa |MW at 25718.4kD 


NOV10a,
CG143787-01 
Protein Sequence 


MLRGISQLPAVATMSWVLLPVLWLIVQTQAIAIKQTPELTLHEIVCPKKLHILHKREIKNNQTEKHG
KEERYEPEVQYQMILNGEEIILSLQKTKHLLGPDYTETLYSPRGEEITTKPENMEHCYYKGNILNEK
NSVASISTCDGLRGYFTHHHQRYLLSQKPKCLLQAPIPTNIMTTPVCGNHLLEVGEDCDCGSLKECT
NLCCEALTCKLKPGTDCGGDAPNHTTE








SEQ ID NO: 43 |706bp 


NOV10b,
278889162 DNA 
Sequence 


CACCGGATCCACCATGCTGCGTGGGATCTCCCAACTACCTGCAGTGGCCACCATGTCTTGGGTCCTG

CTGCCTGTACTTTGGCTCATTGTTCAAACTCAAGCAATAGCCATAAAGCAAACACC 

TCCATGAAATAGTTTGTCCTAAAAAACTTCACATTTTACACAAAAGAGAGATCAAGAACAACCAGAC

AGAAAAGCATGGCAAAGAGGAAAGGTATGAACCTGAAGTTCAATATCAGATGATCTTAAATGGAGAA 






GAAATCATTCTCTCCC^ 

CACCCAGAGGAGAGGAAATTACCACGAAACCTGAGAACATGGAACACTGTTACTATAAAGGAAACAT 
CCTAAATGAAAAGAATTCTGTTGCCAGCATCAGTACTTGTGACGGGTTGAGAGGATACTTCACACAT 
CATCACCAAAGATACCTTTTATCTCAGAAACCAAAGTGCCTGCTGCAAGCACCTATTCCTACAAATA
TAATGACAACACCAGTGTGTGGGAACCACCTTCTAGAAGTGGGAGAAGACTGTGATTGTGGCTCTCT 
TAAGGAGTGTACCAATCTCTGCTGTGAAGCCCTAACGTGTAAACTGAAGCCTGGAACTGATTGCGGA
GGAGATGCTCCAAACCATACCACAGAGCTCGAGGGC

ORF Start: at 2 |ORF Stop: end of sequence 





SEQ ID NO: 44 |235 aa |MW at 26364.1kD


NOV10b,
278889162 
Protein Sequence 


TGSTMLRGISQLPAVATMSWVLLPVLWLIVQTQAIAIKQTPELTLHEIVCPKKLHILHKREIKNNQT
EKHGKEERYEPEVQYQMILNGEEIILSLQKTKHLLGPDYTETLYSPRGEEITTKPENMEHCYYKGNI
LNEKNSVASISTCDGLRGYFTHHHQRYLLSQKPKCLLQAPIPTNIMTTPVCGNHLLEVGEDCDCGSL
KECTNLCCEALTCKLKPGTDCGGDAPNHTTELEG





SEQ ID NO: 45 |118 bp


NOV10c,
278689868 DNA 
Sequence 


CACCGGATCCGAAGTGGGAGAAGACTGTGATTGTGGCTCTCTTAAGGAGTGTACCAATCTCTGCTGT
GAAGCCCTAACGTGTAAACTGAAGCCTGGAACTGATTGCGGACTCGAGGGC




ORF Start: at 2 |ORF Stop: end of sequence





SEQ ID NO: 46 |39 aa |MW at 3983.4kD


NOV10c,

278689868 Protein Sequence 


TGSEVGEDCDCGSLKECTNLCCEALTCKLKPGTDCGLEG
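
The relationship between the NOV10c nucleotide sequence (SEQ ID NO: 45) and its encoded polypeptide (SEQ ID NO: 46) can be illustrated by translating the open reading frame directly. The sketch below is illustrative only: it assumes the standard genetic code, assumes that the first base of the listing (which is not part of the reading frame) reads C, and uses hypothetical helper names. The ORF start position of 2 is taken from the table above.

```python
BASES = "TCAG"
# Standard genetic code, ordered TTT, TTC, TTA, TTG, TCT, ... GGG
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str, start: int = 1) -> str:
    """Translate `dna` from 1-based position `start` until a stop codon
    or the end of the sequence, whichever comes first."""
    seq = dna.upper()
    residues = []
    for i in range(start - 1, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon ends the ORF
            break
        residues.append(aa)
    return "".join(residues)


# NOV10c DNA as listed (118 bp); the first base is an assumed reading.
NOV10C_DNA = (
    "CACCGGATCCGAAGTGGGAGAAGACTGTGATTGTGGCTCTCTTAAGGAGTGTACCAATCTCTGCTGT"
    "GAAGCCCTAACGTGTAAACTGAAGCCTGGAACTGATTGCGGACTCGAGGGC"
)

print(translate(NOV10C_DNA, start=2))
# TGSEVGEDCDCGSLKECTNLCCEALTCKLKPGTDCGLEG
```

The 39 residues obtained agree with the length stated for SEQ ID NO: 46, and no stop codon is encountered, consistent with "ORF Stop: end of sequence".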



Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 10B. 



Table 10B. Comparison of NOV10a against NOV10b and NOV10c.


Protein Sequence 


NOV10a Residues/
Match Residues 


Identities/ 

Similarities for the Matched Region 


NOV10b


1..228 
5..232 


228/228(100%) 
228/228(100%) 


NOV10c


187..219
4..36


33/33 (100%) 
33/33 (100%) 






Further analysis of the NOV10a protein yielded the following properties shown in
Table 10C.



Table 10C. Protein Sequence Properties NOV10a


PSort analysis: 


0.8200 probability located in outside; 0.1900 probability located in lysosome 
(lumen); 0.1000 probability located in endoplasmic reticulum (membrane); 
0.1000 probability located in endoplasmic reticulum (lumen) 


SignalP analysis: 


Cleavage site between residues 33 and 34 



A search of the NOV10a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 10D.



Table 10D. Geneseq Results for NOV10a


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV10a
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Region 


Expect 
Value 


AAW75769 


Human metalloproteinase 
BS 10.55 - Homo sapiens, 
470 aa. [WO9839421-A2,
11-SEP-1998]


1..157 
1..157 


157/157 (100%) 
157/157 (100%)


7e-90 


AAW28509 


Product of clone J5 - Homo 
sapiens, 470 aa. 
[WO9707198-A2, 
27-FEB-1997] 


1..157 
1..157


157/157 (100%) 
157/157(100%) 


7e-90 


AAB53240 


Human colon cancer antigen 
protein sequence SEQ ID 
NO:780 - Homo sapiens, 110 
aa. [WO200055351-A1, 
21-SEP-2000] 


153..228 
35..110


73/76 (96%) 
74/76(97%) 


7e-41 


ABB11929


Human eMDC D protein 
homologue, SEQ ID 
NO:2299 - Homo sapiens, 
788 aa. [WO200157188-A2, 
09-AUG-2001] 


18..159 
18..153


71/142(50%) 
99/142(69%) 


2e-32 


AAW90865 


Human ADAM protein #4 -
Homo sapiens, 775 aa. 
[WO200014227-A1, 
16-MAR-2000] 


18..159 
5..140


71/142(50%) 
99/142(69%) 


2e-32 






In a BLAST search of public sequence databases, the NOV10a protein was found to
have homology to the proteins shown in the BLASTP data in Table 10E. 



Table 10E. Public BLASTP Results for NOV10a


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV10a
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Portion 


Expect 
Value 


O15204


Disintegrin-protease - Homo 
sapiens (Human), 470 aa. 


1..157 
1..157 


157/157(100%) 
157/157(100%) 


2e-89 


Q9R0X2 


Disintegrin metalloprotease 
precursor - Mus musculus 
(Mouse), 467 aa. 


1..157 
1..157 


104/157 (66%) 
124/157 (78%) 


8e-56 


Q9XSL6 


ADAM 28 precursor (EC 
3.4.24.-) (A disintegrin and 
metalloproteinase domain 28) 
(eMDC II) - Macaca
fascicularis (Crab eating 
macaque) (Cynomolgus 
monkey), 776 aa. 


14..159 
1..141


70/146 (47%) 
101/146 (68%) 


le-32 


E1262181 


SEQUENCE 3 FROM 
PATENT WO9709430- 
unidentified, 530 aa. 


18..159 
5.. 140 


71/142 (50%) 
99/142 (69%) 


5e-32 


Q9UKQ2 


ADAM 28 precursor (EC 
3.4.24.-) (A disintegrin and 
metalloproteinase domain 28) 
(Metalloproteinase-like, 
disintegrin-like, and cysteine- 
rich protein-L) (MDC-L) 
(eMDC II) (ADAM23) -
Homo sapiens (Human), 775 
aa. 


18..159
5..140


71/142 (50%) 
99/142(69%) 


5e-32 



PFam analysis predicts that the NOV10a protein contains the domains shown in the
Table 10F. 



Table 10F. Domain Analysis of NOV10a






Pfam Domain 


NOV10a Match
Region 


Identities/
Similarities
for the Matched
Region


Expect Value


Pep_M12B_propep


90..201 


32/119(27%) 
79/119(66%) 


1.8e-20 


disintegrin 


187..219 


20/33 (61%) 
26/33 (79%) 


4e-14 



Example 11. 

The NOV11 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 11A.



Table 11A. NOV11 Sequence Analysis




SEQ ID NO: 47 |484 bp




NOV11a,
CG144112-01
DNA Sequence 


ACTGGGTCCGAATCAGTAGGTGACCCCGCCCCTGGATTCTGGAAGACCTCACCATGGGACGCCCCCG 


ACCTCGTGCGGCCAAGACGTGGATGTTCCTGCTCTTGCTGGGGGGAGCCTGGGCAGGAAATACACAG 
TACGCCTGGGAGACCACAGCCTACAGAATAAAGATGGCCCAGAAGTGCAGTCCCCGAGAGAATTTTC
CTGACACTCTCAACTGTGCAGAAGTAAAAATCTTTCCCCAGAAGAAGTGTGAGGATGCTTACCCGGG 
GCAGATCACAGATGGCATGGTCTGTGCAGGCAGCAGCAAAGGGGCTGACACGTGCCAGGGCGATTCT 
GGAGGCCCCCTGGTGTGTGATGGTGCACTCCAGGGCATCACATCCTGGGGCTCAGACCCCTGTGGGA 
GGTCCGACAAACCTGGCGTCTATACCAACATCTGCCGCTACCTGGACTGGATCAAGAAGATCATAGG
CAGCAAGGGCTGATT




ORF Start: ATG at 54 | ORF Stop: TGA at 480





SEQ ID NO: 48 |142 aa |MW at 15404.5kD


NOV11a,
CG144112-01
Protein Sequence 


MGRPRPRAAKTWMFLLLLGGAWAGNTQYAWETTAYRIKMAQKCSPRENFPDTLNCAEVKIFPQKKCE
DAYPGQITDGMVCAGSSKGADTCQGDSGGPLVCDGALQGITSWGSDPCGRSDKPGVYTNICRYLDWI
KKIIGSKG





SEQ ID NO: 49 |288 bp 




NOV11b,
CG144112-04
DNA Sequence 


CCCCGCCCCTGGATTCTGGAAGACCTCACCATGGGACGCCCCCGACCTCGTGCGGCCAAGACGTGGA
TGTTCCTGCTCTTGCTGGGGGGAGCCTGGGCAGGGCAGGGCGATTCTGGAGGCCCCCTGGTGTGTGA
TGGTGCACTCCAGGGCATCACATCCTGGGGCTCAGACCCCTGTGGGAGGTCCGACAAACCTGGCGTC
TATACCAACATCTGCCGCTACCTGGACTGGATCAAGAAGATCATAGGCAGCAAGGGCTGATTCTAGG
ATAAGCACTAGATCTCCCTT




ORF Start: ATG at 31 | ORF Stop: TGA at 259



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SEQ ID NO: 50 


76 aa |MW at 8110.3kD


NOV11b,
CG144112-04
Protein Sequence 


MGRPRPRAAKTWMFLLLLGGAWAGQGDSGGPLVCDGALQGITSWGSDPCGRSDKPGVYTNICRYLDW
IKKIIGSKG






SEQ ID NO: 51 |445 bp


NOV11c,
255501898 DNA 
Sequence 


CACCAAGCTTATGGGACGCCCCCGACCTCGTGCGGCCAAGACGTGGATGTTCCTGCTCTTGCTGGGG 
GGAGCCTGGGCAGGAAATACACAGTACGCCTGGGAGACCACAGCCTACAGAATAAAGATGGCCCAGA 
AGTGCAGTCCCCGAGAGAATTTTCCTGACACTCTCAACTGTGCAGAAGTAAAAATCTTTCCCCAGAA 
GAAGTGTGAGGATGCTTACCCGGGGCAGATCACAGATGGCATGGTCTGTGCAGGCAGCAGCAAAGGG 
GCTGACACGTGCCAGGGCGATTCTGGAGGCCCCCTGGTGTGTGATGGTGCACTCCAGGGCATCACAT 
CCTGGGGCTCAGACCCCTGTGGGAGGTCCGACAAACCTGGCGTCTATACCAACATCTGCCGCTACCT 
GGACTGGATCAAGAAGATCATAGGCAGCAAGGGCCTCGAGGGC 




ORF Start: at 2 |ORF Stop: end of sequence
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SEQ ID NO: 52 |l48aa 


MW at 16046.2kD 


NOV11c,
255501898 
Protein Sequence 


TKLMGRPRPRAAKTWMFLLLLGGAWAGNTQYAWETTAYRIKMAQKCSPRENFPDTLNCAEVKIFPQK
KCEDAYPGQITDGMVCAGSSKGADTCQGDSGGPLVCDGALQGITSWGSDPCGRSDKPGVYTNICRYL
DWIKKIIGSKGLEG








SEQ ID NO: 53 |358 bp


NOV11d,
255612524 DNA 
Sequence 


CACCAAGCTTGGAAATACACAGTACGCCTGGGAGACCACAGCCTACAGAATAAAGATGGCCCAGAAG
TGCAGTCCCCGAGAGAATTTTCCTGACACTCTCAACTGTGGAGAAGTAAAAATCTTTCCCCAGAAGA 
AGTGTGAGGATGCTTACCCGGGGCAGATCACAGATGGCATGGTCTGTGCAGGCAGCAGCAAAGGGGC 


TGGGGCTCAGACCCCTGTGGGAGGTCCGACAAACCTGGCGTCTATACCAACATCTGCCGCTACCTGG 
ACTGGATCAAGAAGCTCGAGGGC 




ORF Start: at 2 |ORF Stop: end of sequence
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SEQ ID NO: 54 Jl 19 aa |MW at 12908.4kD 


NOV11d,
255612524 
Protein Sequence 


TKLGNTQYAWETTAYRIKMAQKCSPRENFPDTLNCAEVKIFPQKKCEDAYPGQITDGMVCAGSSKGA
DTCQGDSGGPLVCDGALQGITSWGSDPCGRSDKPGVYTNICRYLDWIKKLEG



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SEQ ID NO: 55 |307 bp | 


NOV11e,
255612566 DNA 
Sequence 


CACCAAGCTTCAGAAGTGCAGTCCCCGAGAGAATTTTCCTGACACTCTCAACTGTGCAGAAGTAAAA 
ATCTTTCCCCAGAAGAAGTGTGAGGATGCTTACCCGGGGCAGATCACAGATGGCATGGTCTGTGCAG 
GCAGCAGCAAAGGGGCTGACACGTGCCAGGGCGATTCTGGAGGCCCCCTGGTGTGTGATGGTGCACT 
CCAGGGCATCACATCCTGGGGCTCAGACCCCTGTGGGAGGTCCGACAAACCTGGCGTCTATACCAAC 
ATCTGCCGCTACCTGGACTGGATCAAGAAGCTCGAGGGC




ORF Start: at 2 |ORF Stop: end of sequence





SEQ ID NO: 56 |102 aa |MW at 10922.2kD


NOV11e,
255612566 
Protein Sequence 


TKLQKCSPRENFPDTLNCAEVKIFPQKKCEDAYPGQITDGMVCAGSSKGADTCQGDSGGPLVCDGAL
QGITSWGSDPCGRSDKPGVYTNICRYLDWIKKLEG 





SEQ ID NO: 57 |178 bp




NOV11f,
306434072 DNA 
Sequence 


CACCGGATCCGGGCAGGGCGATTCTGGAGGCCCCCTGGTGTGTGATGGTGCACTCCAGGGCATCACA 
TCCTGGGGCTCAGACCCCTGTGGGAGGTCCGACAAACCTGGCGTCTATACCAACATCTGCCGCTACC 
TGGACTGGATCAAGAAGATCATAGGCAGCAAGGGCCTCGAGGGC




ORF Start: at 2 |ORF Stop: end of sequence





SEQ ID NO: 58 |59 aa |MW at 6072.7kD


NOV11f,
306434072 
Protein Sequence 


TGSGQGDSGGPLVCDGALQGITSWGSDPCGRSDKPGVYTNICRYLDWIKKIIGSKGLEG 
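
The molecular weight figures given alongside each polypeptide can be approximated from the sequence alone. The following sketch assumes standard average residue masses and adds one water molecule for the free termini; it is offered as an illustration and is not necessarily the method used to generate the tables. The function and constant names are hypothetical. The result for NOV11f agrees with the listed figure when that figure is read in daltons.

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water per chain for the free N- and C-termini


def molecular_weight(protein: str) -> float:
    """Approximate average molecular weight of an unmodified polypeptide."""
    return sum(RESIDUE_MASS[aa] for aa in protein) + WATER


NOV11F = "TGSGQGDSGGPLVCDGALQGITSWGSDPCGRSDKPGVYTNICRYLDWIKKIIGSKGLEG"

# Result lies within 0.5 Da of the 6072.7 listed for SEQ ID NO: 58.
print(round(molecular_weight(NOV11F), 1))
```

Applied to the 39-residue NOV10c polypeptide (SEQ ID NO: 46), the same calculation gives a value within 0.5 of the listed 3983.4.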





SEQ ID NO: 59 |436 bp


NOV11g,
CG144112-02
DNA Sequence 


AGTGTGCTGGAATTCGCCCTTACTGGGTCCGAATCAGTAGGTGACCCCGCCCCTGGATTCTTGAAGA 


CCTCACCATGGGACGCCCCCGACCTCGTGCGGCCAAGACGTGGATGTTCCTGCTCTTGCTGGGGGGA 

GCCTGGGCAGAGAATTTTCCTGACACTCTCAACTGTGCAGAAGTAAAAATCTTTCCCCAGAAGAAGT 

GTGAGGATGCTTACCCGGGGCAGATCACAGATGGCATGGTCTGTGCAGGCAGCAGCAAAGGGGCTGA

CACGTGCCAGGGCGATTCTGGAGGCCCCCTGGTGTGTGATGGTGCACTCCAGGGCATCACATCCTGG

GGCTCAGACCCCTGTGGGAGGTCCGACAAACCTGGCGTCTATACCAACATCTGCCGCTACCTGGACT 

GGATCAAGAAGATCATAGGCAGCAAGGGCTGATT 




ORF Start: ATG at 75 | |ORF Stop: TGA at 432 
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SEQ ID NO: 60 Jll9aa 


MW at 12718.4kD 


NOV11g,
CG144112-02
Protein Sequence 


MGRPRPRAAKTWMFLLLLGGAWAENFPDTLNCAEVKIFPQKKCEDAYPGQITDGMVCAGSSKGADTC
QGDSGGPLVCDGALQGITSWGSDPCGRSDKPGVYTNICRYLDWIKKIIGSKG 








SEQ ID NO: 61 |845bp | 


NOV11h,
CG144112-03 
DNA Sequence 


CGCCCTTACTGGGTCCGAATCAGTAGGTGACCCCGCCCCTGGATTCTGGAAGACCTCACCATGGGAC 


GCCCCCGACCTCGTGCGGCCAAGACGTGGATGTTCCTGCTCTTGCTGGGGGGAGCCTGGGCAGGACA 
CTCCAGGGCACAGGAGGACAAGGTGCTGGGGGGTCATGAGTGCCAACCCCATTCGCAGCCTTGGCAG 
GCGGCCTTGTTCCAGGGCCAGCAACTACTCTGTGGCGGTGTCCTTGTAGGTGGCAACTGGGTCCTTA 
CAGCTGCCCACTGTAAAAAACCGAAATACACAGTACGCCTGGGAGACCACAGCCTACAGAATAAAGA 
TGGCCCAGAGCAAGAAATACCTGTGGTTCAGTCCATCCCACACCCCTGCTACAACAGCAGCGATGTG
GAGGACCACAACCATGATCTGATGCTTCTTCAACTGCGTGACCAGGCATCCCTGGGGTCCAAAGTGA 
AGCCCATCAGCCTGGCAGATCATTGCACCCAGCCTGGCCAGAAGTGCACCGTCTCAGGCTGGGGCAC 
TGTCACCAGTCCCCGAGAGAATTTTCCTGACACTCTCAACTGTGCAGAAGTAAAAATCTTTCCCCAG 
AAGAAGTGTGAGGATGCTTACCCGGGGCAGATCACAGATGGCATGGTCTGTGCAGGCAGCAGCAAAG 
GGGCTGACACGTGCCAGGGCGATTCTGGAGGCCCCCTGGTGTGTGATGGTGCACTCCAGGGCATCAC 
ATCCTGGGGCTCAGACCCCTGTGGGAGGTCCGACAAACCTGGCGTCTATACCAACATCTGCCGCTAC 
CTGGACTGGATCAAGAAGATCATAGGCAGCAAGGGCTGATT 




ORF Start: ATG at 61 | |ORF Stop: TGA at 841 
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SEQ ID NO: 62 |260 aa |MW at 28047.6kD 


NOV11h,
CG144112-03
Protein Sequence 


MGRPRPRAAKTWMFLLLLGGAWAGHSRAQEDKVLGGHECQPHSQPWQAALFQGQQLLCGGVLVGGNW
VLTAAHCKKPKYTVRLGDHSLQNKDGPEQEIPVVQSIPHPCYNSSDVEDHNHDLMLLQLRDQASLGS
KVKPISLADHCTQPGQKCTVSGWGTVTSPRENFPDTLNCAEVKIFPQKKCEDAYPGQITDGMVCAGS
SKGADTCQGDSGGPLVCDGALQGITSWGSDPCGRSDKPGVYTNICRYLDWIKKIIGSKG



Sequence comparison of the above protein sequences yields the following sequence
relationships shown in Table 11B.



Table 11B. Comparison of NOV11a against NOV11b through NOV11h.


Protein Sequence 


NOV11a Residues/
Match Residues 


Identities/ 

Similarities for the Matched Region 


NOV11b


97..142
31..76


46/46 (100%) 
46/46(100%) 


NOV11c


1..142
4..145


142/142 (100%) 
142/142 (100%) 


NOV11d


24..139
4..119


114/116 (98%) 
115/116(98%) 






NOV11e


41..139
4..102


97/99 (97%)
98/99 (98%)


NOV11f


91..142 
5..56 


52/52 (100%) 
52/52 (100%) 


NOV11g


1..142 
1..119 


119/142 (83%) 
119/142(83%) 


NOV11h


44..142
162..260


99/99(100%) 
99/99 (100%) 
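
The residue ranges and identity figures of Table 11B can be checked against one another. In the sketch below, the helper names are hypothetical, and the whole-number truncation of the percentage is an observation drawn from the identity figures in the table (for example, 119/142 is 83.8% and is listed as 83%), not a documented property of the alignment program. The similarity percentages do not always follow the same pattern.

```python
def span_length(span: str) -> int:
    """Number of residues covered by a range written as 'first..last'."""
    first, last = (int(x) for x in span.replace(" ", "").split(".."))
    return last - first + 1


def percent_truncated(matches: int, aligned: int) -> int:
    """Whole-number percentage, truncated rather than rounded (assumed)."""
    return matches * 100 // aligned


# NOV11h row: NOV11a residues 44..142 aligned to NOV11h residues 162..260
print(span_length("44..142"), span_length("162..260"))  # 99 99

# NOV11g row: 119 identities over 142 aligned positions
print(percent_truncated(119, 142))  # 83
```

Equal span lengths on both sides, as in the NOV11h row, indicate an ungapped alignment; unequal lengths, as in the NOV11g row (142 versus 119 residues), indicate that gaps were introduced.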



Further analysis of the NOV11a protein yielded the following properties shown in
Table 11C.



Table 11C. Protein Sequence Properties NOV11a


PSort analysis: 


0.3700 probability located in outside; 0.1000 probability located in 
endoplasmic reticulum (membrane); 0.1000 probability located in 
endoplasmic reticulum (lumen); 0.1000 probability located in lysosome 
(lumen) 


SignalP analysis: 


Cleavage site between residues 24 and 25 
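
The predicted cleavage site divides the NOV11a polypeptide into a putative signal peptide and a putative mature chain. A minimal sketch of that division follows; the function name is hypothetical, and the sequence used is SEQ ID NO: 48 with extraction damage repaired by reference to the corresponding nucleotide sequence, which is an assumption of this illustration.

```python
def split_at_cleavage(protein: str, last_signal_residue: int) -> tuple[str, str]:
    """Split a precursor at a cleavage site lying between residues
    `last_signal_residue` and `last_signal_residue + 1` (1-based)."""
    return protein[:last_signal_residue], protein[last_signal_residue:]


NOV11A = (
    "MGRPRPRAAKTWMFLLLLGGAWAGNTQYAWETTAYRIKMAQKCSPRENFPDTLNCAEVKIFPQKKCE"
    "DAYPGQITDGMVCAGSSKGADTCQGDSGGPLVCDGALQGITSWGSDPCGRSDKPGVYTNICRYLDWI"
    "KKIIGSKG"
)

signal, mature = split_at_cleavage(NOV11A, 24)
print(signal)       # MGRPRPRAAKTWMFLLLLGGAWAG
print(len(mature))  # 118
```

The prediction is computational; whether cleavage occurs at this position in vivo is not established by the analysis itself.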



A search of the NOV11a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 11D.
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Table 11D. Geneseq Results for NOVlla 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV11a
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


ABP41332


Human ovarian antigen
HCOQP78, SEQ ID NO:2464
- Homo sapiens, 315 aa.
[WO200200677-A1,
03-JAN-2002]


44..142
217..315


99/99(100%) 
99/99(100%) 


3e-57 


AAU81959 


Human PRO322 - Homo
sapiens, 260 aa. 
[WO200109327-A2, 
08-FEB-2001] 


44..142 
162..260 


99/99 (100%) 
99/99(100%) 


3e-57 




Human PRO322 protein
sequence SEQ ID NO:72 -
Homo sapiens, 260 aa.
[WO200200690-A2,
03-JAN-2002]


44..142
162..260


99/99 (100%) 


3e-57 


ABB95458 


Human angiogenesis related 
protein PRO322 SEQ ID NO:
72 - Homo sapiens, 260 aa. 
[WO200208284-A2, 
31-JAN-2002] 


44..142
162..260


99/99 (100%) 
99/99 (100%) 


3e-57 


AAB53087 


Human 

angiogenesis-associated 
protein PRO322, SEQ ID
NO:127 - Homo sapiens, 260
aa. [WO200053753-A2, 
14-SEP-2000] 


44..142
162..260 


99/99(100%) 
99/99 (100%) 


3e-57 



In a BLAST search of public sequence databases, the NOV11a protein was found to
have homology to the proteins shown in the BLASTP data in Table 11E.



Table 11E. Public BLASTP Results for NOV11a


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV11a
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


Q9NR68 


Serine protease 
kallikrein/ovasin/neuropsin 
type 3 - Homo sapiens 
(Human), 119 aa.


1..142
1..119 


119/142(83%) 
119/142(83%) 


9e-66 






O60259


Neuropsin precursor (EC 
3.4.21.-) (NP) (Kallikrein 8) 
(Ovasin) (Serine protease 
TADG-14) 
(Tumor-associated 
differentially expressed 
gene-14 protein) - Homo 
sapiens (Human), 260 aa. 


44..142
162..260 


99/99 (100%) 




O88780


Neuropsin precursor (EC 
3.4.21.-) (NP) (Kallikrein 8) 
(Brain serine protease 1) - 
Rattus norvegicus (Rat), 260
aa. 


38..141
147..259 


80/113(70%) 
93/113(81%) 


8e-45 


BAB92021 


Neuropsin - Mus musculus 
(Mouse), 176 aa (fragment). 


38..141
63..175


81/113 (71%) 
92/113(80%) 


1e-44


Q61955 


Neuropsin precursor (EC 
3.4.21.-) (NP) (Kallikrein 8) - 
Mus musculus (Mouse), 260 
aa. 


38..141
147..259 


81/113(71%) 
92/113 (80%) 


1e-44



PFam analysis predicts that the NOV11a protein contains the domains shown in the
Table 11F.
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Table 11F. Domain Analysis of NOVlla 


Pfam Domain 


NOV11a Match Region


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


trypsin 


49..134


47/101 (47%) 
76/101 (75%) 


5.5e-40 



Example 12. 

The NOV12 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 12A.



Table 12A. NOV12 Sequence Analysis 




SEQ ID NO: 63 |1536 bp




NOV12a, 
CG144497-01 
DNA Sequence 


AAGAGCCAAGCCAGCATGTCGGGGACCCGAGCC TCCAACGACCG(^CCCCCGnrnr anfirnftrnTP n 
AGCGGGGGCGGCTGCAGCAGGAGGCGGCGGCGACCGGCTCCCGCGTGACGGTGGTGCTGGGCGCGCA 
GTGGGGGGACGAGGGCAAAGGCAAGGTGGTGGACCTGCTGGCCACGGACGCCGACATCATCAGCCGC 
TGCCAGGGGGGCAACMCGCCGGCCACACGGl^TGGTGGATGGGAAAGAGTACGACTTCCACCTGC 
TGCCCAGCGGCATCATCAACACCAAGGCCGTGTCCTTCATTGGTAACGGGGTGGTC 








ATCTCTGACAGAGCCCACCTTGTGTTTGATTTTC 

GCCAGGCACAAGAGGGGAAGAGTATAGGCACCACCAAGAAGGGAATCGGACCAACCTACTCTTCCAA 
AGCTGCCCGGACAGGCCTCCGCATCTGCGACCTCCTGTCAGATTTTGATGAGTTTTCCTCCAGATTC 
AAGAACCTGGCCCACCAGCACCAGTCGATGTTCCCCACCCTGGAAATAGACATTGAAGGCCAACTCA 
AAAGGCTCAAGGGCTTTGCTGAGCGGATCAGACCCATGGTCCGAGATGGTGTTTACTTTATGTATGA 
GGCACTCCACGGCCCCCCCAAGAAGATCCTGGTGGAGGGTGCCAACGCCGCCCTCCTCGACATTGAC 
TTCGGTACCTACCCCTTTGTGACTTCATCCAACTGCACCGTGGGCGGTGTGTGCACGGGCCTGGGCA 
TCCCCCCGCAGAACATAGGTGACGTGTATGGCGTGGTGAAAGCCTATACCACACGTGTGGGCATCGG 
GGCCTTCCCCACCGAGCAGATCAACGAGATTGGAGGCCTGCTGCAGACCCGCGGCCACGAGTGGGGA 
GTG ACC AC AGGCAGGAAGAGGCGCTGCGGC TGGCTCGAC CTGATG ATTCTAAGATATGCTC ACATGG 
TCAACGGATTCACTGCGCTGGCCCTGACGAAGCTGGACATCCTGGACGTACTGGGTGAGGTTAAAGT 
CGGTGTC T C AT AC AAGCTGAAC GGGAAAAGGATTCCCT ATT TC C C AGC TAAC C AGG AGATG C TTC AG 
AAGGTCGAAGTTGAGTATGAAACGCTGCCTGGGTGGAAAGCAGACACCACAGGCGCCAGGAGGTGGG 
AGGACCTGCCCCCACAGGCCCAGAACTACATCCGCTTTGTGGAGAATCACGTGGGAGTCGCAGTCAA 
ATGGGTTGGTGTTGGCAA^T^A^^^^AGTCGATGATCCAGCTGTTTTAGTCACAGACTGAGCTGATC 
CCAACAGGCCCTGGCAGCGTCTGGACTTGTGTAAACAGCAGCAGTCACGTTCCTCGGCCGCCACAAC 


CAACACCAAAGCAGGAAAACCATOTTCTGTACTTT^^^ 




ORF Start: ATG at 16 |ORF Stop: TAG at 1387





SEQ ID NO: 64 |457 aa |MW at 50181.0kD


NOV12a, 
CG144497-01 
Protein Sequence 


MSGTRASITORPPGAGGVKRGRLQQEAAATGSRVTVVLGAQWGDEGKGKVV^ I SRCQGGN 
NAGHTVWDGKEYDFHLLPSGI INTKAVSFIGNGVVIHLPGLFEEAEKN33KKGLKDWEKRLI ISDRA 
HLVFDFHQAVDGLQEVQRQAQEGKS IGTTKKGIGPTYS SKA^RTGLRICDLL SDFDEF S SRFKNL AH 
QHQSMF PTLEI DIEGQLKRLKGFAEKIRPMVRDGVYFMYEALHGP PKK I LVEGANAALLDIDFGTY P 
FVTS SNCTVGGVCTGIiGI PPQNIGDVYGWKAYTTRVGIGAFPTEQINE IGGLLQTRGHEWGVTTGR 
KRRCGWLDLMILRYAHMVNGFTALALTKLDILDV^ 
YETLPGWKADTTGARRWEDLPPQAQNYIRFVENHV 



Further analysis of the NOV12a protein yielded the following properties shown in
Table 12B.



Table 12B. Protein Sequence Properties NOV12a


PSort analysis: 


0.5946 probability located in microbody (peroxisome); 0.3000 probability 
located in nucleus; 0.2377 probability located in lysosome (lumen); 0.1000 
probability located in mitochondrial matrix space 


SignalP analysis: 


No Known Signal Sequence Predicted 
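
The PSort entry in Table 12B lists several candidate compartments, each with its own score. A minimal Python sketch of how such an entry might be parsed and ranked follows; the helper name `parse_psort` and the regular expression are illustrative assumptions, not part of the PSort program itself.

```python
import re

# Pattern for one "<score> probability located in <compartment>" clause.
# The pattern is an assumption based on the wording used in Table 12B.
_PSORT_CLAUSE = re.compile(r"(\d\.\d+) probability located in ([^;]+)")


def parse_psort(text):
    """Return (score, compartment) pairs sorted from highest to lowest score."""
    pairs = [(float(score), place.strip()) for score, place in _PSORT_CLAUSE.findall(text)]
    return sorted(pairs, reverse=True)


# PSort text for NOV12a, joined from the three printed lines of Table 12B.
NOV12A_PSORT = (
    "0.5946 probability located in microbody (peroxisome); "
    "0.3000 probability located in nucleus; "
    "0.2377 probability located in lysosome (lumen); "
    "0.1000 probability located in mitochondrial matrix space"
)

predictions = parse_psort(NOV12A_PSORT)
top_score, top_compartment = predictions[0]
total_score = sum(score for score, _ in predictions)
```

For NOV12a the listed scores add up to more than 1, so they read as independent per-compartment scores rather than as a single probability distribution.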



A search of the NOV12a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publication, yielded
several homologous proteins shown in Table 12C.



Table 12C. Geneseq Results for NOV12a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV12a
Residues/
Match
Residues


Identities/
Similarities for
the Matched
Region


Expect
Value




AAB41627 


Human ORFXORF1391 
polypeptide sequence SEQ 
ID NO:2782 - Homo sapiens, 
314 aa. [WO200058473-A2, 
05-OCT-2000] 


144..457 
1..314 


313/314 (99%) 
314/314 (99%)


0.0 


ABB70971 


Drosophila melanogaster 
polypeptide SEQ ID NO 
39705 - Drosophila
melanogaster, 447 aa. 
[WO200171042-A2, 
27-SEP-2001] 


31..456 
24..446 


270/427 (63%) 
338/427 (78%) 


e-161 


AAY95049 


Candida albicans polypeptide 
sequence # 17 - Candida 
albicans, 412 aa. 
[EP982401-A2,
01-MAR-2000] 


35..455 
4..409 


227/425 (53%) 
306/425 (71%) 


e-130 


AAU23499 


Novel human enzyme 
polypeptide #585 - Homo 
sapiens, 209 aa. 
[WO200155301-A2, 
02-AUG-2001] 


249..457 
1..209 


208/209 (99%) 
209/209 (99%) 


e-121 


AAW99455 


Maize adenylosuccinate 
synthetase - Zea mays, 484 
aa. [US5882869-A, 
16-MAR-1999] 


24..454 
53..482 


217/436 (49%) 
310/436 (70%) 


e-119 



In a BLAST search of public sequence databases, the NOV12a protein was found to
have homology to the proteins shown in the BLASTP data in Table 12D.



Table 12D. Public BLASTP Results for NOV12a


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV12a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


BAC04649 


CDNA FLJ38602 fis, clone 
HEART2003836, highly 
similar to 

ADENYLOSUCCINATE 
SYNTHETASE, MUSCLE 
ISOZYME (EC 6.3.4.4) - 
Homo sapiens (Human), 457 
aa. 


1..457
1..457 


456/457 (99%) 
457/457 (99%) 


0.0 






P28650 


Adenylosuccinate synthetase, 
muscle isozyme (EC 6.3.4.4)
(IMP-aspartate ligase)
(ADSS) (AMPSASE) - Mus 
musculus (Mouse), 457 aa. 


1..457
1..457 


453/457 (98%) 


JM^t Utilu. 4F»k .1^ U H 

0.0 


AJMSDS 


adenylosuccinate synthase 
(EC 6.3.4.4), muscle - mouse,
452 aa. 


1..425 
1..425 


411/425(96%) 
421/425 (98%) 


0.0 


AAH32039 


Similar to 

ADENYLOSUCCINATE 
SYNTHETASE, MUSCLE 
ISOZYME 
(IMP-ASPARTATE 
LIGASE) (ADSS) 
(AMPSASE) - Homo sapiens 
(Human), 502 aa (fragment). 


64..457
109..502 


392/394 (99%) 
394/394 (99%) 


0.0 


Q9CQL9 


Adenylosuccinate synthetase 
(EC 6.3.4.4) (IMP-aspartate 
ligase) (ADSS) (AMPSase) - 
Mus musculus (Mouse), 456 
aa. 


8..457 
4..456 


345/453 (76%) 
399/453 (87%) 


0.0 



PFam analysis predicts that the NOV12a protein contains the domains shown in the 
Table 12E. 



Table 12E. Domain Analysis of NOV12a 


Pfam Domain 


NOV12a Match 
Region 


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


Ald_Xan_dh_C


396..411 


8/16(50%) 
14/16 (88%) 


0.43 


Adenylsucc_synt 


32..455 


261/431 (61%) 
417/431 (97%) 


0 



Example 13. 

The NOV13 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 13A.



Table 13A. NOV13 Sequence Analysis




SEQ ID NO: 65 |278 bp


NOV13a, 
CG144686-01 
DNA Sequence 


TGCCTGTGGGTTTGATTGCTACCACTCTTGCAATTGCTCCTGTCCGCTTTGACAGGGAGAAGGTGTT 
CCGCGTGAAGCCTCAGGATGAAAAACAAGCAGACATCATAAAGGACTTGGCCAAAACCAGTGAGCTC 
CGAGATAAAGGCAAATTTGGTTTTCTCCTTCCAGAATCCCGGATAAAGCCAACGTGCAGAGAGACCA 
TGCTAGCTGTCAAATTTATTGCCAAGTATATCCTCAAGCATACTTCCTAAAGAACTGCCCTCTGTTT
GGAATAAGCC 




ORF Start: at 3 |ORF Stop: TAA at 249





SEQ ID NO: 66 |82 aa |MW at 9327.9kD 


NOV13a, 
CG144686-01 
Protein Sequence 


PVGLIATTLAIAPVRFDREKVFRVKPQDEKQADIIKDLAKTSELRDKGKFGFLLPESRIKPTCRETM
LAVKFIAKYILKHTS 





SEQ ID NO: 67 |268 bp


NOV13b, 
278690008 DNA 
Sequence 


CACCGGATCCACCCCTGTGGGTTTGATTGCTACCACTCTTGCAATTGCTCCTGTCCGCTTTGACAGG
GAGAAGGTGTTCCGCGTGAAGCCTCAGGATGAAAAACAAGCAGACATCATAAAGGACTTGGCCAAAA
CCAGTGAGCTCCGAGATAAAGGCAAATTTGGTTTTCTCCTTCCAGAATCCCGGATAAAGCCAACGTG
CAGAGAGACCATGCTAGCTGTCAAATTTATTGCCAAGTATATCCTCAAGCATACTTCCCTCGAGGGC




ORF Start: at 2 |ORF Stop: end of sequence 






SEQ ID NO: 68 |89 aa |MW at 9973.6kD


NOV13b, 
278690008 
Protein Sequence 


TGSTPVGLIATTLAIAPVRFDREKVFRVKPQDEKQADIIKDLAKTSELRDKGKFGFLLPESRIKPTC
RETMLAVKFIAKYILKHTSLEG 





SEQ ID NO: 69 |94bp | 


NOV13c, 
278690035 DNA 
Sequence 


CACCGGATCCACCAGTGAGCTCCGAGATAAAGGCAAATTTGGTTTTCTCCTTCCAGAATCCCGGATA 
AAGCCAACGTGCAGAGAGCTCGAGGGC 




ORF Start: at 2 |ORF Stop: end of sequence





SEQ ID NO: 70 |31 aa |MW at 3452.9kD 


NOV13c,
278690035
Protein Sequence


TGSTSELRDKGKFGFLLPESRIKPTCRELEG



SEQ ID NO: 71 |1622 bp


NOV13d,
CG144686-02
DNA Sequence



ATGAGGCTCATCCTGCCTGTGGGTTTGATTGCTACCACTCTTGCAATTGCTCCTGTCCGCTTTGACA
GGGAGAAGGTGTTCCGCGTGAAGCCCCAGGATGAAAAACAAGCAGACATCATAAAGGACTTGGCCAA 
AACCAATGAGCTTGACTTCTGGTATCCAGGTGCCACCCACCACGTAGCTGCTAATATGATGGTGGAT 
TTCCGAGTTAGTGAGAAGGAATCCCAAGCCATCCAGTCTGCCTTGGATCAAAATAAAATGCACTATG 
AAATCTTGATTCATGATCTACAAGAAGAGATTGAGAAACAGTTTGATGTTAAAGAAGATATCCCAGG
CAGGCACAGCTACGCAAAATACAATAATTGGGAAAAGATTGTGGCTTGGACTGAAAAGATGATGGAT
AAGTATCCTGAAATGGTCTCTCGTATTAAAATTGGATCTACTGTTGAAGATAATCCACTATATGTTC
TGAAGATTGGGGAAAAGAATGAAAGAAGAAAGGCTATTTTTATGGATTGTGGCATTCACGCACGAGA
ATGGGTCTCCCCAGCATTCTGCCAGTGGTTTGTCTATCAGGCAACCAAAACTTATGGGAGAAACAAA
ATTATGACCAAACTCTTGGACCGAATGAATTTTTACATTCTTCCTGTGTTCAATGTTGATGGATATA
TTTGGTCATGGACAAAGAACCGCATGTGGAGAAAAAATCGTTCCAAGAACCAAAACTCCAAATGCAT
CGGCACTGACCTCAACAGGAATTTTAATGCTTCATGGAACTCCATTCCTAACACCAATGACCCATGT
GCAGATAACTATCGGGGCTCTGCACCAGAGTCCGAGAAAGAGACGAAAGCTGTCACTAATTTCATTA 
GAAGCCACCTGAATGAAATCAAGGTTTACATCACCTTCCATTCCTACTCCCAGATGCTATTGTTTCC 
CTATGGATATACATCAAAACTGCCACCTAACCATGAGGACTTGGCCAAAGTTGCAAAGATTGGCACT 
GATGTTCTATCAACTCGATATGAAACCCGCTACATCTATGGCCCAATAGAATCAACAATTTACCCGA 
TATCAGGTTCTTCTTTAGACTGGGCTTATGACCTGGGCATCAAACACACATTTGCCTTTGAGCTCCG 
AGATAAAGGCAAATTTGGTTTTCTCCTTCCAGAATCCCGGATAAAGCCAACGTGCAGAGAGACCATG 
CTAGCTGTCAAATTTATTGCCAAGTATATCCTCAAGCATACTTCCTAAAGAACTGCCCTCTGTTTGG



AATAAGCCAATTAATCCTTTTTTGTGCCTTTCATCAGAAAGTCAATCTTCAGTTATCCCCAAATGCA 



GCTTCTATTTCACCTGAATCCTTCTCTTGCTCATTTAAGTCCCATGTTACTGCTGTTTGCTTTTACT 



TACTTTCAGTAGCACCATAACGAAGTAGCTTTAAGTGAAACCTTTTAACTACCTTTCTTTGCTCCAA 



GTGAAGTTTGGACCCAGCAGAAAGCATTATTTTGAAAGGTGATATACAGTGGGGCACAGAAAACAAA 



TGAAAACCCTCAGTTTCTCACAGATTTTCACCATGTGGCTTCATCAATTTATGTGCTAATACAATAA 



ORF Start: ATG at 1 



ORF Stop: TAA at 1252 





SEQ ID NO: 72 |417 aa |MW at 48699.4kD 


NOV13d, 
CG144686-02 
Protein Sequence 


MRLILPVGLIATTLAIAPVRFDREKVFRVKPQDEKQADIIKDLAKTNELDFWYPGATHHVAANMMVD

FRVSEiCESQAIQSALDQNKMHYEILIHDLQEEIEKQFDVKEDIPGRH^ 

KYPEMVSRIKIGSTVEDNPIiVVIaKIGEKNERIUCAIFMDCGIHAREWVSPAFC 

IMTKLLDR>^YILPVFNVDGYIWSWTKNRMWRKNRSKNQNSKC IGTDLNRNFNASWNS I PNTNDPC 
ADNYRGSAPESEKETKAVTNFIRSHLNEIKOT 

DVLSTRYETRYI YGP IESTI YP I SGS SLDWAYDLG IKHTFAF ELRDKGKFGFLLPESRIKPTCRETM 
LAVKFIAKYILKHTS
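
The ORF and molecular weight annotations in Table 13A can be cross-checked against the listed sequences. The following Python sketch translates the NOV13c DNA sequence (SEQ ID NO: 69) from position 2 to the end of the sequence and estimates the molecular weight of the product. The standard genetic code and the average residue masses are assumptions supplied here, and the function names are illustrative only.

```python
# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Average residue masses in daltons (assumed reference values, not from the source).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.0153  # one water is added for the free N- and C-termini


def translate(dna, start):
    """Translate from 1-based position `start` to the first stop codon or the end."""
    seq = dna.upper()
    protein = []
    for pos in range(start - 1, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def molecular_weight(protein):
    """Average molecular weight of an unmodified linear peptide, in daltons."""
    return sum(RESIDUE_MASS[residue] for residue in protein) + WATER_MASS


# NOV13c DNA sequence (SEQ ID NO: 69, 94 bp) as listed in Table 13A.
NOV13C_DNA = (
    "CACCGGATCCACCAGTGAGCTCCGAGATAAAGGCAAATTTGGTTTTCTCCTTCCAGAATCCCGGATA"
    "AAGCCAACGTGCAGAGAGCTCGAGGGC"
)

nov13c_protein = translate(NOV13C_DNA, start=2)
nov13c_mw = molecular_weight(nov13c_protein)
```

Under these assumptions the translation reproduces the 31-residue NOV13c protein sequence (SEQ ID NO: 70), and the computed mass agrees with the tabulated figure of 3452.9 when that figure is read in daltons.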



Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 13B. 



Table 13B. Comparison of NOV13a against NOV13b through NOV13d.


Protein Sequence 


NOV13a Residues/ 
Match Residues 


Identities/ 

Similarities for the Matched Region 


NOV13b 


1..82 
5..86 


82/82(100%) 
82/82(100%)


NOV13c


41..65
4..28


25/25 (100%)
25/25 (100%)


NOV13d


1..44
6..49


43/44 (97%)
44/44 (99%)
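
The residue ranges and identity counts in Table 13B can be reproduced by direct comparison of the listed sequences. The Python sketch below counts identical positions over an ungapped matched region. The NOV13b string is an assumed cleaned reading of SEQ ID NO: 68, and NOV13a is taken here as residues 5..86 of NOV13b, which Table 13B reports as 82/82 identical; `count_identities` is a hypothetical helper name.

```python
def count_identities(query, q_start, q_end, subject, s_start, s_end):
    """Count identical residues over an ungapped region (1-based, inclusive coordinates)."""
    q_region = query[q_start - 1:q_end]
    s_region = subject[s_start - 1:s_end]
    if len(q_region) != len(s_region):
        raise ValueError("an ungapped comparison needs regions of equal length")
    matches = sum(a == b for a, b in zip(q_region, s_region))
    return matches, len(q_region)


# Assumed cleaned reading of the NOV13b protein (SEQ ID NO: 68, 89 aa).
NOV13B = (
    "TGSTPVGLIATTLAIAPVRFDREKVFRVKPQDEKQADIIKDLAKTSELRDKGKFGFLLPESRIKPTC"
    "RETMLAVKFIAKYILKHTSLEG"
)
# NOV13a taken as residues 5..86 of NOV13b (82 aa), per Table 13B.
NOV13A = NOV13B[4:86]
# NOV13c protein (SEQ ID NO: 70, 31 aa).
NOV13C = "TGSTSELRDKGKFGFLLPESRIKPTCRELEG"

identical, aligned = count_identities(NOV13A, 41, 65, NOV13C, 4, 28)
```

With these inputs the comparison of NOV13a residues 41..65 against NOV13c residues 4..28 gives 25 identities over 25 aligned positions, in line with the NOV13c row of the table.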



Further analysis of the NOV13a protein yielded the following properties shown in
Table 13C.



Table 13C. Protein Sequence Properties NOV13a


PSort analysis: 


0.5500 probability located in endoplasmic reticulum (membrane); 0.1900 
probability located in lysosome (lumen); 0.1000 probability located in 
endoplasmic reticulum (lumen); 0.1000 probability located in outside 


SignalP analysis: 


No Known Signal Sequence Predicted 



A search of the NOV13a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publication, yielded
several homologous proteins shown in Table 13D.



Table 13D. Geneseq Results for NOV13a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV13a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


AAU84325 


Protein CPA3 differentially 
expressed in breast cancer 
tissue - Homo sapiens, 417 
aa. [WO200210436-A2, 
07-FEB-2002]


1..44 
6..49 


43/44 (97%) 
44/44 (99%) 


2e-17 


AAG75369 


Human colon cancer antigen 
protein SEQ ID NO:6133 - 
Homo sapiens, 180 aa. 
[WO200122920-A2, 
05-APR-2001] 


43..82 
141..180 


40/40 (100%) 
40/40 (100%) 


9e-17 


AAU04477 


Porcine carboxypeptidase B 
(CpB) protein - Sus scrofa, 
306 aa. [WO200151624-A2, 
19-JUL-2001] 


41..80 
266..305 


25/40(62%) 
34/40 (84%) 


4e-10 


AAR75132


Porcine carboxypeptidase B -
Sus scrofa, 306 aa.
[WO9514096-A1,
26-MAY-1995]


41..80
266..305


25/40(62%)
34/40 (84%)


4e-10








AAR75131 


Porcine Tyr-His-Met 
Procarboxypeptidase B - Sus 
scrofa, 404 aa. 
[WO9514096-A1, 
26-MAY-1995] 


41..80 
364..403 


25/40(62%) 
34/40(84%) 


4e-10 



In a BLAST search of public sequence databases, the NOV13a protein was found to
have homology to the proteins shown in the BLASTP data in Table 13E.



Table 13E. Public BLASTP Results for NOV13a


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV13a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


P15088 


Mast cell carboxypeptidase A 
precursor (EC 3.4.17.1) 
(MC-CPA) (Carboxypeptidase 
A3) - Homo sapiens (Human), 
417 aa. 


1..44 
6..49 


43/44(97%) 
44/44(99%) 


5e-17 


P97597 


Mast cell carboxypeptidase A 
precursor - Rattus norvegicus 
(Rat), 412 aa (fragment). 


43..82 
373..412 


37/40(92%) 
39/40(97%) 


1e-14


P21961 


Mast cell carboxypeptidase 
(EC 3.4.17.1) (RMC-CP) 
(Carboxypeptidase A3) - Rattus 
norvegicus (Rat), 309 aa. 


43..82 
270..309 


37/40(92%) 
39/40(97%) 


1e-14


P15089 


Mast cell carboxypeptidase A 
precursor (EC 3.4.17.1) 
(MC-CPA) (Carboxypeptidase 
A3) - Mus musculus (Mouse), 
417 aa. 


43..82 
378..417 


36/40(90%) 
39/40(97%) 


7e-14 


P00732 


Carboxypeptidase B (EC 
3.4.17.2) - Bos taurus (Bovine), 
306 aa. 


41..80 
266..305 


26/40(65%) 
36/40(90%) 


7e-11



PFam analysis predicts that the NOV13a protein contains the domains shown in the
Table 13F.



Table 13F. Domain Analysis of NOV13a


Pfam Domain 


NOV13a Match Region 


Identities/ 
Similarities 
for the Matched 
Region 


Expect Value 


Zn_carbOpept 


41..65 


16/30(53%) 
24/30 (80%) 


5.6e-08 



Example 14. 

The NOV14 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 14A. 



Table 14A. NOV14 Sequence Analysis 




SEQ ID NO: 73 |829 bp


NOV14a, 
CG144906-01 
DNA Sequence 


GCCCTTCGCGGGAGAGGAGGCCATGGGCGCGCGCGGGGCGCTGCTGCTGGCGCTGCTGCTGGCTCGG 
GCTGGACTCAGGAAGCCGGAGTCGCAGGAGGCGGCGCCCTTATCAGGACCATGCGGCCGACGGGTCA 
TCACGTCGCGCATCGTGGGTGGAGAGGACGCCGAACTCGGGCGTTGGCCGTGGCAGGGGAGCCTGCG 
CCTGTGGGATTCCCACGTATGCGGAGTGAGCCTGCTCAGCCACCGCTGGGCACTCACGGCGGCGCAC 
TGCTTTGAAACCTATAGTGACCTTAGTGATCCCTCCGGGTGGATGGTCCAGTTTGGCCAGCTGACTT
CCATGCCATCCTCCACATTTGAGTTTGAGAACCGGACAGACTGCTGGGTGACTGGCTGGGGGTACAT
CAAAGAGGATGAGGCACTGCCATCTCCCCACACCCTCCAGGAAGTTCAGGTCGCCATCATAAACAAC
TCTATGTGCAACCACCTCTTCCTCAAGTACAGTTTCCGCAAGGACATCTTTGGAGACATGGTTTGTG 
CTGGCAATGCCCAAGGCGGGAAGGATGCCTGCTTCGGTGACTCAGGTGGACCCTTGGCCTGTAACAA 
GAATGGACTGTGGTATCAGATTGGAGTCGTGAGCTGGGGAGTGGGCTGTGGTCGGCCCAATCGGCCC 
GGTGTCTACACCAATATCAGCCACCACTTTGAGTGGATCCAGAAGCTGATGGCCCAGAGTGGCATGT 
CCCAGCCAGACCCCTCCTGGCCACTACTCTTTTTCCCTCTTCTCTGGGCTCTCCCACTCCTGGGGCC 
GGTCTGAGCCTACCTGAGCCCATGC




ORF Start: ATG at 23 |ORF Stop: TGA at 809





SEQ ID NO: 74 |262 aa |MW at 28826.7kD


NOV14a, 
CG144906-01 
Protein Sequence 


MGARGALLLALLLARAGLRKPESQEAAPLSGPCGRRVITSRIVGGEDAELGRWPWQGSLRLWDSHVC
GVSLLSHRWALTAAHCFETYSDLSDPSGWMVQFGQLTSMPSSTFEFENRTDCWVTGWGYIKEDEALP
SPHTLQEVQVAIINNSMCNHLFLKYSFRKDIFGDMVCAGNAQGGKDACFGDSGGPLACNKNGLWYQI
GVVSWGVGCGRPNRPGVYTNISHHFEWIQKLMAQSGMSQPDPSWPLLFFPLLWALPLLGPV





SEQ ID NO: 75 |989bp | 


NOV14b, 
CG144906-02 
DNA Sequence 


AATCGCCCTTCGCGGGAGAGGAGGCCATGGGCGCGCGCGGGGCGCTGCTGCTGGCGCTGCTGCTGGC 
TCGGGCTGGACTCAGGAAGCCGGAGTCGCAGGAGGCGGCGCCCTTATCAGGACCATGCGGCCGACGG 
GTCATCACGTCGCGCATCGTGGGTGGAGAGGACGCCGAACTCGGGCGTTGGCCGTGGCAGGGGAGCC 
TGCGCCTGTGGGATTCCCACGTATGCGGAGTGAGCCTGCTCAGCCACCGCTGGGCACTCACGGCGGC 
GCACTGCTTTGAAACCTATAGTGACCTTAGTGATCCCTCCGGGTGGATGGTCCAGTTTGGCCAGCTG
ACTTCCATGCCATCCTTCTGGAGCCTGCA^

TGAGCCCTCGCTACCTGGGGAATTCACCCTATGACATTGCCTTGGTGAAGCTGTCTGCACCTGTCAC 
CTACACTAAACACATCCAGCCCATCTGTCTCCAGGCCTCCACATTTGAGTTTGAGAACCGGACAGAC 
TGCTGGGTGACTGGCTGGGGGTACATCAAAGAGGATGAGGCACTGCCATCTCCCCACACCCTCCAGG 
AAGTTCAGGTCGCCATCATAAACAACTCTATGTGCAACCACCTCTTCCTCAAGTACAGTTTCCGCAA 
GGACATCTTTGGAGACATGGTTTGTGCTGGCAATGCCCAAGGCGGGAAGGATGCCTGCTTCGGTGAC 
TCAGGTGGACCCTTGGCCTGTAACAGGAATGGACTGTGGTATCAGATTGGAGTCGTGAGCTGGGGAG
TGGGCTGTGGTCGGCCCAATCGGCCCGGTGTCTACACCAATATCAGCCACCACTTTGAGTGGATCCA 
GAAGCTGATGGCCCAGAGTGGCATGTCCCAGCCAGACCCCTCCTGGCCACTACTCTTTTTCCCTCTT 
CTCTGGGCTCTCCCACTCCTGGGGCCGGTCTGAGCCTACCTTAGCCCATGC 




ORF Start: ATG at 27 


ORF Stop: TGA at 969 





SEQ ID NO: 76 |314 aa |MW at 34911.6kD


NOV14b, 
CG144906-02 
Protein Sequence 


MGARGALLLALLLARAGLRKPESQEAAPLSGPCGRRVITSRIVGGEDAELGRWPWQGSLRLWDSHVC
GVSLLSHRWALTAAHCFETYSDLSDPSGWMVQFGQLTSMPSFWSLQAYYTRYFVSNIYLSPRYLGNS
PYDIALVKLSAPVTYTKHIQPICLQASTFEFENRTDCWVTGWGYIKEDEALPSPHTLQEVQVAIINN
SMCNHLFLKYSFRKDIFGDMVCAGNAQGGKDACFGDSGGPLACNRNGLWYQIGVVSWGVGCGRPNRP
GVYTNISHHFEWIQKLMAQSGMSQPDPSWPLLFFPLLWALPLLGPV



Sequence comparison of the above protein sequences yields the following sequence
relationships shown in Table 14B.



Table 14B. Comparison of NOV14a against NOV14b.


Protein Sequence 


NOV14a Residues/ 
Match Residues 


Identities/ 

Similarities for the Matched Region 


NOV14b 


20..240 
20..292 


219/273 (80%) 
221/273 (80%) 
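
Entries of the form "219/273 (80%)" combine a match count, an alignment length and a printed whole-number percentage. Elsewhere in these tables the printed percentage does not always equal the rounded ratio (for example, 44/44 is printed as 99%), so it can be useful to recover the exact ratio from the counts. A minimal Python sketch, with `parse_match_entry` and `exact_percent` as hypothetical helper names:

```python
import re
from fractions import Fraction

# Matches "<count>/<length> (<percent>%)" with optional spaces, as printed in the tables.
_ENTRY = re.compile(r"(\d+)\s*/\s*(\d+)\s*\(\s*(\d+)\s*%\s*\)")


def parse_match_entry(text):
    """Split an entry such as '219/273 (80%)' into (matched, length, printed_percent)."""
    found = _ENTRY.search(text)
    if found is None:
        raise ValueError(f"not an identities/similarities entry: {text!r}")
    matched, length, printed = (int(group) for group in found.groups())
    return matched, length, printed


def exact_percent(matched, length):
    """Exact percentage of matched positions over the alignment length."""
    return float(Fraction(matched, length) * 100)


# The two entries of the NOV14b row in Table 14B.
identities = parse_match_entry("219/273 (80%)")
similarities = parse_match_entry("221/273 (80%)")
identity_percent = exact_percent(identities[0], identities[1])
similarity_percent = exact_percent(similarities[0], similarities[1])
```

For this row the exact values are about 80.2% and 81.0%, and both printed figures equal the whole-number part of the exact ratio.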



Further analysis of the NOV14a protein yielded the following properties shown in
Table 14C.



Table 14C. Protein Sequence Properties NOV14a 


PSort analysis: 


0.5422 probability located in outside; 0.4639 probability located in lysosome 
(lumen); 0.2779 probability located in microbody (peroxisome); 0.1900 
probability located in plasma membrane 


SignalP analysis: 


Cleavage site between residues 20 and 21



A search of the NOV14a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publication, yielded
several homologous proteins shown in Table 14D.



Table 14D. Geneseq Results for NOV14a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV14a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


AAE17010 


Human eosinophil serine 
protease-1 (esp-1) like 
enzyme #2 - Homo sapiens, 
314 aa. [WO200198503-A2, 
27-DEC-2001] 


1..262 
1..314 


262/314(83%) 
262/314(83%) 


e-154 


AAB80256 


Human PRO303 protein - 
Homo sapiens, 314 aa. 
[WO200104311-A1, 
18-JAN-2001] 


1..262 
1..314 


262/314 (83%) 
262/314 (83%) 


e-154 


AAU01569 


Human secreted protein 
immunogenic epitope 
encoded by gene #9 - Homo 
sapiens, 315 aa. 
[WO200123547-A1, 
05-APR-2001] 


1..262 
1..314 


262/314(83%) 
262/314 (83%) 


e-154 


AAU02223 


Human extracellular serine 
protease TADG-16 - Homo 
sapiens, 314 aa. 
[WO200127257-A1, 
19-APR-2001] 


1..262 
1..314 


262/314 (83%) 
262/314 (83%) 


e-154 


AAY91871 


Human cancer-specific gene 
protein, Prol04 - Homo 
sapiens, 327 aa. 
[WO200016805-A1, 
30-MAR-2000] 


1..262 
14..327 


262/314(83%) 
262/314 (83%) 


e-154 



In a BLAST search of public sequence databases, the NOV14a protein was found to
have homology to the proteins shown in the BLASTP data in Table 14E. 



Table 14E. Public BLASTP Results for NOV14a


Protein
Accession
Number


Protein/Organism/Length


NOV14a
Residues/
Match
Residues


Identities/
Similarities for
the Matched
Portion


Expect
Value

Q9Y6M0 


Testisin precursor (EC 
3.4.21.-) (Eosinophil serine 
protease 1) (ESP-1) - Homo
sapiens (Human), 314 aa.


1..262 
1..314 


262/314(83%) 
262/314 (83%) 


e-154 


Q9JHI7 


Testisin precursor (EC 
3.4.21.-) (Tryptase 4) - Mus

Tniicr*n1iic /IV/f niicf*^ 
lllUoL-UIUo yiVHJUoC/^ J£rr aa. 


1..261 
1..323 


179/326 (54%) 
210/326 (63%) 


1e-98


Q920S2 


Testis serine protease-1 - Mus
musculus (Mouse), 322 aa. 


1..261 
1..321 


150/325 (46%) 
180/325 (55%) 


2e-69 


Q9D4I3 


4931440B09Rik protein - 
Mus musculus (Mouse), 282 
aa. 


32..261 
2..281 


135/283 (47%) 
161/283 (56%) 


1e-66


Q9PVX7 


Epidermis specific serine 
protease - Xenopus laevis 
(African clawed frog), 389 aa. 


33..244 
17..277 


100/264 (37%) 
136/264 (50%) 


3e-45 
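
Expect values in these tables are printed in several forms, including a bare exponent such as "e-154" and the literal "0.0". The Python sketch below converts such entries to floating-point numbers so that hits can be ranked. Reading a bare exponent as 1e-154 is an assumption about the BLAST output convention, and `parse_expect` is a hypothetical helper name.

```python
def parse_expect(text):
    """Convert a printed Expect value to a float.

    A bare exponent such as 'e-154' is read as 1e-154 (assumed convention).
    """
    value = text.strip().lower()
    if value.startswith("e"):
        value = "1" + value
    return float(value)


# Accession and Expect value for each row of Table 14E.
TABLE_14E_HITS = [
    ("Q9Y6M0", "e-154"),
    ("Q9JHI7", "1e-98"),
    ("Q920S2", "2e-69"),
    ("Q9D4I3", "1e-66"),
    ("Q9PVX7", "3e-45"),
]

# Smaller Expect values indicate matches less likely to arise by chance.
ranked_hits = sorted(TABLE_14E_HITS, key=lambda hit: parse_expect(hit[1]))
best_accession = ranked_hits[0][0]
```

Sorted this way, the rows keep the order in which Table 14E lists them, with Q9Y6M0 first.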



PFam analysis predicts that the NOV14a protein contains the domains shown in the 
Table 14F. 




Table 14F. Domain Analysis of NOV14a 


Pfam Domain 


NOV14a Match Region 


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


trypsin 


42..85 


24/51 (47%) 
36/51 (71%) 


2.3e-13 


trypsin 


119..229 


52/121 (43%) 
92/121 (76%) 


9e-43 



Example 15.

The NOV15 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 15A.



Table 15A. NOV15 Sequence Analysis



SEQ ID NO: 77 |716 bp


NOV15a,
CG144997-01 
DNA Sequence 


GAGTGAC^GATGA<^T<^TTTCTGTTCCTGGCCCAC^ 

GCGGCTCTCGCGGGTTCGGGATGTTCTATGCCGTGAGGAGGGGCCGCAAGACCGGGGTCTTTCTGAC 
CTGGAATGAGTGCAGAGACACGTTTTCCTACATGGGAGACTTCGTCGTCGTCTACACTGATGGCTGC 
TGCTCCAGTAATGGGCGTAGAAGGCCGCGAGCAGGAATCGGCGTTTACTGGGGGCCGGGCCATCCTT 
TAAATGTAGGCATTAGACTTCCTGGGCGGCAGACAAACCAAAGAGCGGAAATTCATGCAGCCTGCAA 
AGCCATTGAACAAGCAAAGACTCAAAACATCAATAAACTGGTTCTGTATACAGACAGTATGTTTACG
ATAAATGGTATAACTAACTGGGTTCAAGGTTGGAAGAAAAATGGGTCGAAGACAAGTGCAGGGAAAG 
AGGTGATCAACAAAGAGGACTTTGTGGCACTGGAGAGGCTTACCCAGGGGATGGACATTCAGTGGAT 
GCATGTTCCTGGTCATTCGGGATTTATAGGCAATGAAGAAGCTGACAGATTAGCCAGAGAAGGAGCT 
AAACAATCGGAAGACTGAGCCATGTGACTTTAGTCCTTGGGAGAACTTGAGCCAGCGGCTGTCTTGC 


TGCCTGTACTTACTGGTGTGGAAAATAGCCTGCAGGTAGGACCATT 




ORF Start: ATG at 10 |ORF Stop: TGA at 619





SEQ ID NO: 78 |203 aa |MW at 22889.0kD


NOV15a, 
CG144997-01 
Protein Sequence 


MSWFLFIJUIRVALAALPCRRGSRGFGMFYAVRRGRKTGWLTWI^ 

NGRRRPRAG IGVYWGPGH PLNVG IRL PGRQTNQRAE I HAACKAI EQ AKTQNI NKLVLYTDSMFT ING 

ITNWVQGWKXNGWKTSAGKEVINKEDFVALERLTQ 

ED 





SEQ ID NO: 79 |631 bp | 


NOV15b, 
278693648 DNA 
Sequence 


CUtCCGGATCGACCATGAGCTGGTTTCTGTTCCTGGCCCACAG 

CGCCGCGGCTCTCGCGGGTTCGGGATGTTCTATGCCGTGAGGAGGGGCCGCAAGACCGGGGTCTTTC 
TGACCTGGAATGAGTGCAGAGACACGTTTTCCTACATCGGAGACTTCGTCGTCGTCTACACTGATGG 
CTGCTGCTCCAGTAATGGGCGTAGAAG^CCGCGAGCAGGAATCGGCGTTTACTGGGGGCCGGGCCAT 
CCTTTAAATGTAGGCATTAGACTTCCTGGGCGGCAGACAAACCAAAGAGCGGAAATTCATGCAGCCT 
GCAAAGCCATTGAACAAGCAAAGACTCAAAACATCAATAAACTGGTTCTGTATACAGACAGTATGTT 
TACGATAAATGGTATAACTAACTGGGTTCAAGGTTGGAAGAAAAAT<KK3TGGAAGACAAGTGCAGGG 
AAAGAGG TG ATC AACAAAG AGG AC T TTGT<3G C AC TGG AG AG^CTTACCC AGGGG ATCGACATTCAGT 
GGATGCAT<3TTCCTGGTCATTCGGGATTTATAGGCAAT<^GAAGCTGACAGATTAGCCAGAGAAGG 
AGCTAAACAATCGGAAGACC TCGAGGGC 




ORF Start: at 2 |ORF Stop: end of sequence 





SEQ ID NO: 80 |210 aa |MW at 23534.6kD 


NOV15b, 
278693648 
Protein Sequence 


TGSTMSWFLFLAHRVALAAL PCRRG SRGFGMF YAVRRGRKTG VFLTWNECROTFSYMGDFVVVYTDG 
CC S SNGRRR PRAG IGVYWG PGHPLNVG I RL PGRQTNQRAEIHAACKAI EQAKTQNINKLVLYTDSMF 
TINGITNWQGWKKNGTOTSAGKEVINKEDFVALERLTQGKDIQWM^ 
AKQSEDLEG 





SEQ ID NO: 81 |586bp 


NOV15c, 
278480974 DNA 
Sequence 


caccggatccgccttgccctgccgccgcggctctcgcgggttcgggatgttctatgccgtgaggagg 
g^cgcaagaccggggtctttctgacctggaatgagtgcagagacacgttttcctaca 
tcgtcgtcgtctacactgatggctgct<x:tccagtaatgggcgtagaag^ 
cgtttactgggggccgggccatcctttaaatgtaggcatta^^ 

ag^gcggaaattcatgcagcctgcaaagccattgaacaagcaaagactcaaaacatcaataaactgg 
ttctgtatacagacagtatgtttacgataaat<3gtataactaactgggttcaaggttggaaga7vaaa 
tgggtggaagacaagtgcagggaaagaggtcatcaacaaagaggactttgt^
ACCCAG^ATCKSACATTCAGTGGATC^

CTGACAGATTAGCCAGAGAAGGAGCTAAACAATCGGAAGACCTCGAGGGC 




ORF Start: at 2 |ORF Stop: end of sequence 





SEQ ID NO: 82 |195 aa |MW at 21789.5kD


NOV15c, 
278480974 
Protein Sequence 


TGSAJjPCRRGSRGFGMFYAVRRGRKTGVFLTWNECRDTFSYMGDF^ 

VYWGPGHPLNVGIRLPGRQTNQRAE IHAACKAI EQAKTQNINKL VL YTDSMFT I NG ITNWVQGWKKN 
GWTSAGKEVINKEDFVALERLTQGMDI QWMHVPGHSGFIGNEEADRLAR EGAKQS EDLEG 





SEQ ID NO: 83 |457 bp


NOV15d, 
278498047 DNA 
Sequence 


CACCGGATCCGGAGACTTCGTCGTCGTCTACACTGATGGCTGCTGCTCCAGTAATGGGCGTAGAAGG 
CCGCGAGCAGGAATCGGCGTTTACTGGGGGCCGGGCCATCCTTTAAATGTAGGCATTAGACTTCCTG 
GGCGGCAGACAAACCAAAGAGCGGAAATTCATGCAGCCTGCAAAGCCATTGAACAAGCAAAGACTCA 
AAACATCAATAAACTGGTTCTGTATACAGACAGTATGTTTACGATAAATGGTATAACTAACTGGGTT 
CAAGGTTGGAAGAAAAATGGGTGGAAGACAAGTGCAGGGAAAGAGGTGATCAACAAAGAGGACTTTG 
TGGCACTGGAGAGGCTTACCCAGGGGATGGACATTCAGTGGATGCATGTTCCTGGTCATTCGGGATT 
TATAGGCAATGAAGAAGCTGACAGATTAGCCAGAGAAGGAGCTAAACTCGAGGGC 




ORF Start: at 2 |ORF Stop: end of sequence 





SEQ ID NO: 84 |152 aa |MW at 16753.8kD


NOV15d, 
278498047 
Protein Sequence 


TGSGDFVVVYTIXSCCSSNGRRRPRAGIGVTOGPGHPLWGIRL^ 

NINKLVLYTDSMFTINGITNWVQGWKKN^ 

IGNEEADRLAREGAKLEG 





SEQ ID NO: 85 |965 bp


NOV15e, 
CG144997-02 
DNA Sequence 


GAGTGAGCGAT<^GCTGGTTTCTGTTCCTGGCCCACAGAGTCGCCTTGGCCGCCTTGCCCTGCCGCC 
GCGGCTCTCGCGGGTTCGGGATGTTCTATGCCGTGAGGAGGGGCCGCAAGACCGGGGTCTTTCTGAC
CTGGAATGAGTGCAGAGCACAGGTGGACCGGTTTCCTGCTGCCAGATTTAAGAAGTTTGCCACAGAG 
GATGAGGCCTGGGCCTTTGTCAGGAAATCTGCAAGCCCGGAAGTTTCAGAAGGGCATGAAAATCAAC 
ATGGACAAGAATCGGAGGCGAAAGCCAGCAAGCGACTCCGTGAGCCACTGGATGGAGATGGACATGA 
AAGCGCAGAGCCGTATGCAAAGCACATGAAGCCGAGCGTGGAGCCGGCGCCTCCAGTTAGCAGAGAC 
ACGTTTTCCTACATGGGAGACTTCGTCGTCGTCTACACTGATGGCTGCTGCTCCAGTAATGGGCGTA 
GAAGGCCGCGAGCAGGAATCGGCGTTTACTGGGGGCCAGGCCATCCTTTAAATGTAGGCATTAGACT 
TCCTGGGCGGCAGACAAACCAAAGAGCGGAAATTCATGCAGCCTGCAAAGCCATTGAACAAGCAAAG
ACTCAAAACATCAATAAACTGGTTCTGTATACAGACAGTATGTTTACGATAAATGGTATAACTAACT 
GGGTTCAAGGTTGGAAGAAAAATGGGTGGAAGACAAGTGCAGGGAAAGAGGTGATCAACAAAGAGGA 
CTTTGTGGCACTGGAGAGGCTTACCGAGGGGATGGAGATTCAGTGGAT^ 

GGATTTATAGGCAATGAAGAAGCTGACAGATTAGCCAGAGAAGGAGCTAAACAATCGGAAGACTGAG 
CCATGTGACTTTAGTCCTTGGGAGAACTTGAGCCAGCGGCTGTCTTGCTGCCTGTACTTACTGGTGT 


GGAAAATAGCCTGCAGGTAGGACCATT




ORF Start: ATG at 10 |ORF Stop: TGA at 868





SEQ ID NO: 86 |286 aa |MW at 32098.0kD


NOV15e, 
CG144997-02 
Protein Sequence 


MSWFLFLAHRVALAALPCRRGSRGFGMFYAVRRGRKTGVFLTWNECRAQTO 

WAFVRKSASPEVSEGHENQHGQESEAKASKRLREPLTODGHESAEPYAKHMKPSVEPAPPVSRIOT 
YMGDFVWYTDGCC SSNGRRRPRAG IG VYWG PGHPLNVG IRL PGRQTNQRAE I HAACKAI EQAKTQN 
Il^VLYTDSMFTINGITNWQGWKKNGWKTSAGKEVINKEDFVALERLTQGMDIQWMHVPGHSGFI 
GNEEADRLAREG AKQS ED 




Sequence comparison of the above protein sequences yields the following sequence
relationships shown in Table 15B.



Table 15B. Comparison of NOV15a against NOV15b through NOV15e.


Protein Sequence 


NOV15a Residues/ 
Match Residues 


Identities/ 

Similarities for the Matched Region 


NOV15b 


1..203 
5..207 


203/203 (100%) 
203/203 (100%) 


NOV15c 


14..203 
3..192 


189/190(99%) 
190/190 (99%) 


NOV15d 


54..199
4..149


146/146 (100%) 
146/146 (100%) 


NOV15e 


47..203 
130..286 


157/157 (100%) 
157/157 (100%) 



Further analysis of the NOV15a protein yielded the following properties shown in
Table 15C.



Table 15C. Protein Sequence Properties NOV15a


PSort analysis: 


0.3700 probability located in outside; 0.1805 probability located in microbody 
(peroxisome); 0.1080 probability located in nucleus; 0.1000 probability 
located in endoplasmic reticulum (membrane) 


SignalP analysis: 


Cleavage site between residues 15 and 16 
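
The SignalP result in Table 15C places the predicted cleavage site between residues 15 and 16, which divides the protein into a signal peptide and a mature chain. A minimal Python sketch of that split follows. The N-terminal string is an assumed reading of NOV15a taken from residues 5 onward of the NOV15b listing, which Table 15B reports as identical to NOV15a, and `split_signal_peptide` is a hypothetical helper name.

```python
def split_signal_peptide(protein, last_signal_residue):
    """Split a protein after the given 1-based residue into (signal peptide, mature chain)."""
    if not 0 < last_signal_residue < len(protein):
        raise ValueError("cleavage site must fall inside the sequence")
    return protein[:last_signal_residue], protein[last_signal_residue:]


# Assumed N-terminal portion of NOV15a (first 38 residues only).
NOV15A_N_TERMINUS = "MSWFLFLAHRVALAALPCRRGSRGFGMFYAVRRGRKTG"

signal_peptide, mature_start = split_signal_peptide(NOV15A_N_TERMINUS, 15)
```

Under this reading the predicted signal peptide is MSWFLFLAHRVALAA and the mature chain would begin with LPCRRG.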



A search of the NOV15a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publication, yielded
several homologous proteins shown in Table 15D.



Table 15D. Geneseq Results for NOV15a


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date]


NOV15a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Region 


Expect 
Value 


AAY70235 


Human RNA-associated 
protein-16 (RNAAP-16) - 
Homo sapiens, 286 aa. 
[WO200011171-A2, 
02-MAR-2000]


47..203 
130..286 


157/157 (100%) 
157/157 (100%) 


6e-92 


AAB97508 


Human type II RNase H 
protein - Homo sapiens, 286 
aa. [WO200123613-A1, 
05-APR-2001] 


47..203 
130..286 


156/157 (99%) 
157/157 (99%) 


1e-91


AAY25094 


Human type 2 RNase H 
protein - Homo sapiens, 286 
aa. [WO9928447-A1,
10-JUN-1999] 


47..203 
130..286 


156/157 (99%) 
157/157 (99%) 


1e-91


ABB83371 


Human wild-type RNase HI 
- Homo sapiens, 286 aa. 
[WO200240635-A2, 
23-MAY-2002] 


47..203 
130..286 


156/157 (99%) 
156/157 (99%) 


2e-90 


ABB83374 


Mutant RNase HI, E186Q - 
Homo sapiens, 286 aa. 
[WO200240635-A2, 
23-MAY-2002] 


47..203 
130..286 


155/157 (98%) 
156/157 (98%) 


5e-90 



In a BLAST search of public sequence databases, the NOV15a protein was found to
have homology to the proteins shown in the BLASTP data in Table 15E.



Table 15E. Public BLASTP Results for NOV15a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV15a
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Portion 


Expect 
Value 


O60930


Ribonuclease HI (EC 
3.1.26.4) (RNase HI) 
(Ribonuclease H type II) - 
Homo sapiens (Human), 286 
aa. 


47..203 
130..286 


157/157(100%) 
157/157 (100%) 


2e-91 






Q8VCR6 


Ribonuclease HI - Mus 
musculus (Mouse), 285 aa. 


47..203
129..285 


150/157 (95%) 


5e&?"" "■ 


O70338


Ribonuclease HI (EC 
3.1.26.4) (RNase HI) - Mus 
musculus (Mouse), 285 aa. 


47..203 
129..285 


139/157 (88%) 
150/157 (95%) 


5e-83 


Q91953 


mRNA, complete cds, clone 
CLFEST65 - Gallus gallus 
(Chicken), 293 aa. 


50..202 
140..292 


117/153(76%) 
135/153 (87%) 


4e-70 


Q21024 


F59A6.6 protein - 
Caenorhabditis elegans, 369 
aa. 


58..199
222..363 


65/142 (45%) 
93/142 (64%) 


3e-32 



PFam analysis predicts that the NOV15a protein contains the domains shown in the
Table 15F.



Table 15F. Domain Analysis of NOV15a 


Pfam Domain 


NOV15a Match Region 


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


rnaseH 


54..199 


65/176(37%) 
125/176 (71%) 


2.8e-54 



Example 16. 

The NOV16 clone was analyzed, and the nucleotide and encoded polypeptide

sequences are shown in Table 16A. 



Table 16A. NOV16 Sequence Analysis 




SEQ ID NO: 87 |2274 bp


NOV16a, 
CG145494-01 
DNA Sequence 


CCCCTAGTGACACTCAGGAAATGCTTGTCTCCGGCTGTTAAGGAATAATTTCAGAGTACTATGGATC 


ATGCTGAAGAAAATGAAATCCTTGCAGCAACCCAGAGGTACTATGTGGAAAGGCCTATCTTTAGTCA 
TCCGGTCCTCCAGGAAAGACTACACACAAAGGACAAGGTTCCTGATTCCATTGCGGATAAGCTGAAA 
CAGGCATTCACATGTACTCCTAAAAAAATAAGAAATAT<^TTTATATGTTCCTACCCATAACTAAAT 

GCTTCAGCTTCCTCAAGGCTTAGCCTTTGCAATGCTGGCAGCTGTGCCTCCAATATTTGGCCTGTAC 




TTGCTGTTATTAGCCTGATGATTGGTGGTGTAGCTGTTCGATTAGTACCAGATGATATAGTCATTCC 
AGGAGGAGTAAATGCAACCAATGGCACAGAGGCCAGAGATGCCTTGAGAGTGAAAGTCGCCATGTCT 
GTGACCTTACTTTCAGGAATCATTCAGTTTTGCCTAGGTGTC 

ATCTCACAGAGCCTCTGGTCCGTGGGTTTACCACCGCAGCAGCTGTGCATGTCTTCACCTCCATGTT 
AAAATATCTGTTTGGAGTTAAAACAAAGCGGTACAGTGGAATCTTTTCCGTGGTGTATAGTACAGTT 
GCTGTGTTGCAGAATGTTAAAAACCTCAACGTGTGTTCCCTAGGCGTCGGGCTGATGGTTTTTGGTT 




GTTCT TTGCGGTCGTAATGGGAACTGGC ATTTC AGC TGGGTTTAACTTGAAAGAATC ATACAATGTG 
GATGTCGTTGGAACACTTCCTCTAGGGCTGCTACCTCCAGCCAATCCGGACACCAGCCTCTTCCACC 
TTGTGTACGTAGATGCCATTGCCATAGCCATCGTTGGATTTTCAGTC 







CTTAGCAAATAAACATGGCTACCAGGTTGACGGCA^^ 

TCCATTGGCTCACTCO^CAGACCTTTTCAATTTCATGCTCCTTGTCTCGAAGCCTTGTTCAGGAGG 
GAACCGGTGGGAAGACACAGGCTGTGCTGTCGGCCATTGTGATTGTCAACCTGAAGGGAATGTTTAT 
GCAGTTCTCAGATCTCCCCTTTTTCTGGAGAACCAGCAAAATAGAGCTGACCATCTGGCTTACCACT 
TTTGTGTCCTCCTTGTTCCTGGGATTGGACTATCGTTTGATCACTGCTGTGATCATTGCTCTGCTGA 
CTGTGATTTACAGAACACAGAGTCCAAGCTACAAAGTCCTTGGAAAGCTTCCTGAAACTGATGTGTA 
TATTGATATAGACGCATATGAGGAGGTGAAAGAAATTCCTGGAATAAAAATATTTCAAATAAATGCA 
CCAATTTACTATGCAAATAGCGACTTGTATAGCAATGCATTAAAACGAAAGACTGGAGTGAACCCAG 
CAGTCATCATGGGAGCAAGGAGAAAGGCCATGCGGAAGTACGCTAAGGAAGTCGGAAATGCAAATAT 
GGCCAACGCAACTGTTGTCAAAGCAGATGCAGAAGTAGATGGAGAGGATGCTACCAAGCCTGAAGAA 
G AGG ATGGTG AAGT AAAATATC CC CCAAT AGTG ATCAAAAG CAC ATTT CC TG AGG AAATGC AAAGAT 
TTATGCCCCCAGGGGATAACGTCCACACTGTCATTTTGGATTTCACTCAAGTCAATTTTATTGATTC 
TGTTGGAGTGAA71ACTCTGGCAGGGATTGTAAAAGAATATGGAGACGTCGGTATATATGTATACTTA 
GCAGGATGCAGTGCACAAGTTGTGAATGACCTCACTCGGAATAGATTTTTTGAAAATCCTGCCCTAT 
GGGAGCTGCTGTTCCACAGCATTCATGATGCAGTTTTAGGCAGCCAACTTAGAGAGGCACTTGCTGA 
AC AGG AAGC CTCGGC TCCCCCTTCCCAGGAGG ACTTGGAGCCC AATGCC ACTCC TGC C ACTCC TGAG 
GCATAQATGAGGACCTCACCCTAGGATGGGGTTATAAGCCTCTCATGAAGTTCATAATTTACA 




ORF Start: ATG at 61 |ORF Stop: TAG at 2215





SEQ ID NO: 88 |718 aa |MW at 78546.4kD 


NOV16a, 
CG145494-01 
Protein Sequence 


MDHAEENEILAATQRYYVERPIFSHPVLQERLHTKDKVPDSIADKLKQAFTCTPKKIRNIIYMFLP^ 
TKWLPAYKFKEYVLGDLVSGI STGVLQLPQGLAFAMLAAVPPIFGLYSSFYPVTMYCFLGTSRHI S I 
GPFAVI SLMIGGVAVRLVPDDI VIPGGVNATNGTEARDALRVKVAMSVTLLSGI IQFCLGVCRFGFV 
A I YLTEPIjVRGFTTAAAVHVFTSMLKYLFGVKTKRYSG I FSWYSTVAVLQNVKNLNVC S LGVGLMV 
FGLLLGGKEFNERFKEKLPAPIPLEFFAWMGTGISAGFNLKESYNVDWGTLPLGLLPPANPDTSL 
FHLVYVDAIAIAIVGFSVTISMAKTLANKHGYQVDC^ 

QEGTGGKTQAVLS AXVIVNLKGMFMQFSDL PFFWRTSK I ELTIWLTTFVSSLFLGLDYGLITAVI I A 
LLTVIYRTQSPSYKVLGKLPETDVYIDIDAYEEVKEIPGIKIFQINAPIYYANSDLYSNALKRKTGV 
NPAV I MGARRKAMRKYAK EVGNANMANATVVKADAEVDG EDATKPE EEDGEVKY PP I V I K S TFPEEM 
QRFMP PGDNVHTVIIJDFTQVNFIDSVGVKTLAG I VKEYGDVGI YVYLAGC S AQWNDLTRNRFFENP 
ALWELLFHS IHDAVLG SQLREALAEQEAS APPSQEDLEPNAT PAT PEA 
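The ORF coordinates and the protein length reported in Table 16A can be cross-checked against each other. The following is a minimal sketch, assuming that both positions are 1-based and refer to the first base of the start and stop codons respectively (an assumption, though it agrees with the reported 718 aa for NOV16a). The function name `orf_length_aa` is hypothetical and not part of the original disclosure.

```python
def orf_length_aa(start: int, stop: int) -> int:
    """Return the number of codons translated between a start and a stop codon.

    Assumption: `start` and `stop` are 1-based positions of the first base of
    the start codon and of the stop codon; the stop codon is not translated.
    """
    span = stop - start
    if span <= 0 or span % 3 != 0:
        raise ValueError("start and stop codons are not in the same reading frame")
    return span // 3


# Values taken from the ORF lines of the sequence tables.
print(orf_length_aa(61, 2215))   # NOV16a: ATG at 61, TAG at 2215
print(orf_length_aa(201, 1830))  # NOV17a: ATG at 201, TAA at 1830
```

A mismatch between the computed value and the tabulated length would point to a transcription error in one of the coordinates.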



Further analysis of the NOV16a protein yielded the following properties shown in 
Table 16B. 



Table 16B. Protein Sequence Properties NOV16a 


PSort analysis: 


0.6000 probability located in plasma membrane; 0.4000 probability located in 
Golgi body; 0.3200 probability located in nucleus; 0.3000 probability located 
in endoplasmic reticulum (membrane) 


SignalP analysis: 


No Known Signal Sequence Predicted 
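The PSort line in Table 16B is a semicolon-separated list of probability and location pairs. As an illustration only, the sketch below turns such a line into a mapping; the regular expression and the function name `parse_psort` are assumptions made for this example and are not part of the PSort program.

```python
import re

# One "<probability> probability located in <location>" entry.
PSORT_ENTRY = re.compile(r"(\d\.\d+) probability located in ([^;]+)")

NOV16A_PSORT = """0.6000 probability located in plasma membrane; 0.4000 probability located in
Golgi body; 0.3200 probability located in nucleus; 0.3000 probability located
in endoplasmic reticulum (membrane)"""


def parse_psort(text: str) -> dict[str, float]:
    """Map each predicted location to its reported probability."""
    flat = " ".join(text.split())  # undo the line wrapping of the table cell
    return {loc.strip(): float(prob) for prob, loc in PSORT_ENTRY.findall(flat)}


scores = parse_psort(NOV16A_PSORT)
best = max(scores, key=scores.get)
print(best, scores[best])
```

The highest-scoring location is only a prediction and should be read together with the SignalP result in the same table.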



A search of the NOV16a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 16C. 



Table 16C. Geneseq Results for NOV16a 






Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV16a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 

AAY71067 


Human membrane transport 
protein, MTRP-12 - Homo 
sapiens, 758 aa. 
[WO200026245-A2, 
11-MAY-2000] 


9..684 
15..738 


291/741 (39%) 
433/741 (58%) 


e-148 


AAG67162 


Amino acid sequence of a 
human 32613 transporter 
polypeptide - Homo sapiens, 
751 aa. [WO200164875-A2, 
07-SEP-2001] 


9..684 
15..731 


289/734 (39%) 
432/734 (58%) 


e-147 


ABG61914 


Prostate cancer-associated 

protein #1 IS - Mammalia, 

790 aa. [WO200230268-A2, 
18-APR-2002] 


16..699 
20..741 


268/723 (37%) 
4107723 Cil%\ 


e-144 


AAM51696 


Human pendrin SEQ ID NO 
2 - Homo sapiens, 780 aa. 
[JP2001228146-A, 
24-AUG-2001] 


16..699 
20..741 


268/723 (37%) 
419/723 (57%) 


e-144 


AAM51695 


Mouse pendrin SEQ ID NO 1 
- Mus sp, 780 aa. 
[JP2001228146-A, 
24-AUG-2001] 


16..688 
20..730 


270/713 (37%) 
414/713 (57%) 


e-142 
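Each identity cell in Table 16C gives a match count, an alignment length and a whole-number percentage. A minimal sketch of how such an entry can be checked follows, assuming the percentage is truncated rather than rounded (an assumption that agrees with the identity entries of Table 16C, for example 270/713 listed as 37%). The names `percent_truncated` and `entry_is_consistent` are hypothetical, and the check is exercised here on identity entries only.

```python
import re

# Matches entries such as "291/741 (39%)" or "718/744(96%)".
ENTRY = re.compile(r"(\d+)\s*/\s*(\d+)\s*\((\d+)%\)")


def percent_truncated(matches: int, aligned: int) -> int:
    """Whole-number percentage, truncated toward zero (assumed convention)."""
    if aligned <= 0:
        raise ValueError("alignment length must be positive")
    return (100 * matches) // aligned


def entry_is_consistent(entry: str) -> bool:
    """True if the listed percentage equals the truncated ratio."""
    m = ENTRY.fullmatch(entry.strip())
    if m is None:
        raise ValueError(f"unrecognized entry: {entry!r}")
    matches, aligned, listed = map(int, m.groups())
    return percent_truncated(matches, aligned) == listed


for cell in ("291/741 (39%)", "289/734 (39%)", "268/723 (37%)", "270/713 (37%)"):
    print(cell, entry_is_consistent(cell))
```

An entry that fails the check is a candidate transcription error and is worth comparing against the original alignment.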



In a BLAST search of public sequence databases, the NOV16a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 16D. 




Table 16D. Public BLASTP Results for NOV16a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV16a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


P58743 


Prestin - Homo sapiens 
(Human), 744 aa. 


1..718 
1..744 


718/744(96%) 
718/744 (96%) 


0.0 


Q9JKQ2 


Prestin - Meriones 
unguiculatus (Mongolian 
jird) (Mongolian gerbil), 744 
aa. 


1..718 
1..744 


679/744 (91%) 
699/744 (93%) 


0.0 


Q99NH7 


Prestin - Mus musculus 
(Mouse), 744 aa. 


1..718 
1..744 


680/744(91%) 
700/744(93%) 


0.0 






Q9EPH0 


Prestin - Rattus norvegicus 
(Rat), 744 aa. 


1..718 
1..744 


699/744 (92%) 




AAH28856 


Solute carrier family 26, 
member 6 - Mus musculus 
(Mouse), 735 aa. 


16..684 
8..715 


282/718 (39%) 
432/718 (59%) 


e-148 



PFam analysis predicts that the NOV16a protein contains the domains shown in 
Table 16E. 




Table 16E. Domain Analysis of NOV16a 


Pfam Domain 


NOV16a Match Region 


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


COX3 


334..458 


31/266(12%) 
80/266 (30%) 


0.7 


Sulfate_transp 


193..477 


111/328(34%) 
234/328 (71%) 


7e-78 


STAS 


500..683 


34/188(18%) 
124/188 (66%) 


1.4e-12 



Example 17. 

The NOV17 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 17A. 



Table 17A. NOV17 Sequence Analysis 




SEQ ID NO: 89 |2124 bp 


NOV17a, 
CG145722-01 
DNA Sequence 


AAGC TG AGG TC TT ATAG ATTGG TGGTAC T T AAGGCAG AAAATTAACAC CGTGTT TTGT AGC TGT T AG 


TTGGTAGAGGGAAATTCAGGCTACCGTCGCGAAACCTGCAGGTTAAGTTATTTTCTCCTCCCTGCTT 


CTGTAGGTTCACAGCGTTCCCTTCTGATAGAGCTTTTTGTCTGTGTTGTAAAGCTC 


TGGATGACAAAGATATTGACAAAGAAC^AAGGCAGAAATTAAACTTTTCCTATTGTGAG^ 

GATTGAAGGGCAGAAGAAAGTAGAAGAAAGCAGGGAGGCTTCGAGCCAAACCCCAGAGAAGGGTC 

GTGCAGGATTCAGAGGCAAAG^TACACCACCTTGGACTCCCCTTAGC^ACGTGCATGAGCTCGACA 

CATCTTCGGAAAAAGACAAAGAAAGTCCAGATCAGATTTTGAGGACTCCAGTGTCACACCCTCTCAA 

ATGTCCTGAGACACCAGCCCAACCAGACAGCAGGAGCAAGCTGCTGCCCAGTG^CAGCCCCTCTACT 

CCCAAAACCATGCTGAGCCGGTTGGTGATTTCTCCAACAGGGAAGCTTCCTTCCAGAGGCCCTAAGC 

ATTTGAAGCTCACACCTGCTCCCCTCAAGGATGAGATGACCTCATTGGCTCTGGTCAATATTAATCC 

CTTCACTCCAGAGTCCTATAAAAAATTATTTCTT 

GTTTTACGAGAAACCAACATGGCTTCCCGCTATGAAAAAGAATTCTTGGAGGTTGAAAAAATTGGGG 
TTG GCG AATT TGG TACAGTC TACAAGTGCATTAAGAGGCTGGATGGATGTGTTTATGCAAT AAAGCG 
CTCTATGAAAAC T TTTAC AGAATTATCAAATGAGAATTCG<3CTTTGCATGAAGTTTATGCTC ACGC A 
GTGCTTGGGCATCACCCCCATGTGGTACGTTOCTAT^ 

TTCAGAATGAATACTGCAATGGTGGGAGTTTGCAAGCTGCTATATCTGAAAACACTAAGTCTGGCAA 
TCATTTTGAAGAGCCAAAACTCAAGGACATCCTTCTACAGATTTCCCTTGGCCTTAATTACATCCAC 
AACTCTAGCATGGTACACCTGGACATCAAACCTAGTAATATATTCATTTGTCACAAGATGCAAAGTG 
AATCCTCTGGAGTCATAGAAGAAGTTGAAAATGAAGCTGATTGGTTTCTCTCTGCCAATGTGATGTA 
TAAAATTGGTGACCTGGGCCACGCAACATCAATAAACAAACCCAAAGTGGAAGAAGGAGATAGTCGC 






TTCCTGGC^^ 

GATTAACAATTCCAGTCGCTGCAGGAGCAGAGTCATTGCCCACCAATGGTGCTGCATGGCACCATAT 
CCGCAAGGGTAAC TTTCCGGACGTTCCTC AGGAGC TCTCAGAAAGCTTTTCC AGTC TGCTC AAGAAC 
ATGATCCAACCTGATGCCGAACAGAGACCTTCTGCAGCAGCTCTGGCCAGAAATACAGTTCTCCGGC 
CTTCCCTGGGAAAAACAGAAGAGCTCCAACAGCAGCTGAATTTGGAAAAGTTCAAGACTGCCACACT 
GGAAAGGGAACTGAGAGAAGCCCAGCAGGCCCAGTCACCCCAGGGATATACCCATCATGGTGACACT 
GGGGTCTCTGGGAC C C AC ACAGGATCAAGAAGC ACAAAACGCC TGGTGGGAGGAAAGAGTGCAAGGT 
CTTCAAGCTTTACCTGTGAGTA ATCTTCCCCTTAAGAACTCATTTTGCAGCCGGGCGTGGTGGCTCA 
CGCCTGTAATCCCAACACTTTGGGAGGCCAAGGCAGGTGGATCATGAGGTCAGGAGATCGAAACCAT 
CCTGGCTAACACGGTGAAACCCCATCTCTACTAAAAATACAAAAAATTAGCAGGGCGAGGTGGCAGG 
CGCCTATAATCCCAG9TACTCAGGAGGCTGAGGAAGGAGAATCGCTTGAACCCGGGAGGTGGAGCTT 
GCAGTGAGCTGAGATCACACCACTGCACTCCAGCCTGGGCAACAGAG 

ORF Start: ATG at 201 | ORF Stop: TAA at 1830 





SEQ ID NO: 90 |543 aa |MW at 60514.5kD 


NOV17a, 
CG145722-01 
Protein Sequence 


J©DKDIDKELRQKLNFSYCEETEIEGQKKVEESREASSQTPEKGEVQDSEAKGTPIWPLSNVHELD 
TSSEKDKESPDQILRTPVSHPLKCPETPAQPDSRSKIiLPSDSPSTPKTMLSRLVISPTGKLPSRGPK 
HLKLTPAPLKDEMTSI^WINPFTPESyKKLFLQSGGKRKIRRCVLRET^IMASRYEKEFLEVEKIG 
VGEFGTVYKC IKRLDGC VYA IKRSMKTFTEL SNENSALHE\T¥AHAVI/GHHPHWR YYSSWAEDDHMI 
IQNEYCTGGSLQAAISEbTTCSGNHFEEPKLKDIL^ 

ESSGVIEEVENEADWFLSANVMYKIGDLGHATSINKPKVEEGDSRFLANEILQEDYRHLPKADIFAL 
GLTI AVAAGAESLPTNGAAWHHIRKGNF PDVPQELSES F SSLLKNMIQPDAEQRPS AAALARNTVLR 
PSLGKTEELQQQLNLEKFKTATLERELREAQQAQSPQGYTHHGDTGVSGTHTGSRSTKRLVGGKSAR 
SSSFTCE 



Further analysis of the NOV17a protein yielded the following properties shown in 
Table 17B. 




Table 17B. Protein Sequence Properties NOV17a 


PSort analysis: 


0.4500 probability located in cytoplasm; 0.3000 probability located in 
microbody (peroxisome); 0.1000 probability located in mitochondrial matrix 
space; 0.1000 probability located in lysosome (lumen) 


SignalP analysis: 


No Known Signal Sequence Predicted 



A search of the NOV17a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 17C. 



Table 17C. Geneseq Results for NOV17a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV17a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 






AAB62519 


Xenopus Wee1 protein 
catalytic domain (residues 
210-443) - Xenopus sp, 240 
aa. [US6225101-B1, 
01-MAY-2001] 


188..431 
1..240 


170/244 (69%) 
191/244 (77%) 


* - 


AAY51401 


Xenopus sp. Wee1 catalytic 
domain protein fragment - 
Xenopus sp, 240 aa. 
[US6020194-A, 
01-FEB-2000] 


188..431 
1..240 


170/244 (69%) 
191/244 (77%) 


1e-94 


ABB60693 


Drosophila melanogaster 
polypeptide SEQ ID NO 
8871 - Drosophila 
melanogaster. 609 aa. 
[WO200171042-A2, 
27-SEP-2001] 


109..501 
101..551 


180/464(38%) 
257/464 (54%) 


9e-78 


AAY96776 


Z. mays partial wee1 kinase - 
7!pa niflv^ aa 
[WO200037645-A2, 
29-JUN-2000] 


185..465 
264..513 


103/282 (36%) 
153/282 (53%) 


3e-45 


AAY96770 


Z. mays partial wee1 kinase - 
Zea mays, 403 aa. 
[WO200037645-A2, 
29-JUN-2000] 


185..465 
142..391 


103/282 (36%) 
153/282 (53%) 


3e-45 


In a BLAST search of public sequence databases, the NOV17a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 17D. 


Table 17D. Public BLASTP Results for NOV17a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV17a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Portion 


Expect 
Value 


O95017 


WUGSC:HJDJ0894A10.2 
protein - Homo sapiens 
(Human), 541 aa (fragment). 


1..541 
1..541 


541/541 (100%) 
541/541 (100%) 


0.0 


P47817 


Wee1-like protein kinase (EC 
2.7.1.112) - Xenopus laevis 
(African clawed frog), 555 
aa. 


10..542 
11..552 


291/560(51%) 
352/560 (61%) 


e-143 


O57473 


Wee1 homolog - Xenopus 
laevis (African clawed frog), 
554 aa. 


10..542 
11..551 


294/566(51%) 
357/566(62%) 


e-143 






Q8QGV2 


Wee1B kinase - Xenopus 
laevis (African clawed frog), 
595 aa. 


10..541 
20..593 


350/579 (60%) 




Q63802 


Wee1-like protein kinase (EC 
2.7.1.112) - Rattus 
norvegicus (Rat), 646 aa. 


92..541 
168..644 


236/484 (48%) 
308/484 (62%) 


e-118 



PFam analysis predicts that the NOV17a protein contains the domains shown in 
Table 17E. 




Table 17E. Domain Analysis of NOV17a 


Pfam Domain 


NOV17a Match Region 


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


pkinase 


194..462 


73/310(24%) 
193/310 (62%) 


6.4e-45 



Example 18. 

The NOV18 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 18A. 



Table 18A. NOV18 Sequence Analysis 




SEQ ID NO: 91 |753 bp 


NOV18a, 
CG145754-01 
DNA Sequence 


TCC CTTCTCCTGCCCCTGCAG ATCCTACTGCTATC CTTAGCCTTGGAAAC TG C AGGAGAAGAAGCCC 
AGGGTGACAAGATTATTGATGGCGCCCCATGTGCAAGAGGCTCCCACCCATGGCAGGTGGCCCTGCT 
CAGTGGCAATCAGCTCCACTGCGGAGGCGTCCTGGTCAATGAGCGCTGGGTGCTCACTGCCGCCCAC 
TGCAAGATGAATGAGTACACCGTGCACCTGGGCAGTGATACGCTGGGCGACAGGAGAGCTCAGAGGA 
TCAAGGCCTCGAAGTCATTCCGCCACCCCGGCTACTCCACACAGACCCATGTTAATGACCTCATGCT 
CGTGAAGCTCAATAGCCAGGCCAGGCTGTCATCCATGGTGAAGAAAGTCAGGCTGCCCTCCCGCTGC 
GAACCCCCTGGAACCACCTGTACTGTCTCCGGCTGGGGCACTACCACGAGCCCAGATGTGACCTTTC 
CCTCTGACCTCATGTGCGTGGATGTCAAGCTCATCTCCCCCCAGGACTGCACGAAGGTTTACAAGGA 
CTTACTGGAAAATTCCATGCTGTGCGCTGGCATCCCCGACTCCAAGAAAAACGCCTGCAATGGTGAC 
TCAGGGGGACCGTTGGTGTGCAGAGGTACCCTGCAAGGTCTGGTGTCCTGGGGAACTTTCCCTTGCG 
GCCAACCCAATGACCCAGGAGTCTACACTCAAGTGTGCAAGTTCACCAAGTGGATAAATGACACCAT 
GAAAAAGCATCGCTAA 




ORF Start: at 1 | ORF Stop: TAA at 751 








SEQ ID NO: 92 |250 aa |MW at 27166.0kD 


NOV18a, 
CG145754-01 
Protein Sequence 


SLLLPLQILLLSLALBTAGEEAQGDKI IDGAPCARGSHPWQVALL SGNQLHCGGVLVNERWVLTAAP 
CKMNEYTVHLGSDTLGDRRAQRIi^SKSFRHPGYST^^ 

EPPGTTCTVSGWGTTTSPDVTFPSDLMCVDVKL I S PODCTKVYKDLL ENSMLCAGI PDSKKNACNGD 
SGGPLVCRGTLQGLVSWGTFPCGQPNDPGVYTQVC 





SEQ ID NO: 93 |862 bp 




NOV18b, 
CG145754-03 
DNA Sequence 


ACTGGGTCCGAATCAGTAGGTGACCCCGCCCCTGGATTCTGGAAGACCTCACCATGGGACGCCCCrfl 


ACCTCGTGCGGCCAAGACGTGGATGTTCCTGCTCTTACTGGGGGGAGCCTGGGCAGCCAGGGGTGAC 
AAGATTATTGATGGCGCCCCATGTGCAAGAGGCTCCCACCCATGGCAGGTGGCCCTGCTCAGTGGCA 
ATCAGCTCCACTGCGGAGGCGTCCTGGTCAATGAGCGCTGGGTGCTCACTGCCGCCCACTGCAAGAT 
GAATGAGTACACCGTGCACCTGGGCAGTGATACGCTGGGCGACAGGAGAGCTCAGAGGATCAAGGCC 
TCGAAGTCATTCCGCCACCCCGGCTACTCCACACAGACCCATGTTAATGACCTCATGCTCGTGAAGC 
TCAATAGCCAGGCCAGGCTGTCATCCATGGTGAAGAAAGTCAGGCTGCCCTCCCGCTGCGAACCCCC 
TGGAACC ACCTGTACTGTC TCCGGCTGGGGCAC TACC ACGAGCCC AGATGTGACCTTTCCCTC TGAC 
CTCATGTGCGTGGATGTCAAGCTCATCTCCCCCCAGGACTGCACGAAGGTTTACAAGGACTTACTGG 
AAAATTCCATGCTGTGCGCTGGCATCCCCGACTCCAAGAAAAACGCCTGCAATGGTGACTCAGGGGG 
ACCGTTGGTGTGCAGAGGTACCCTGCAAGGTCTGGTGTCCTGGGGAACTTTCCCTTGCGGCCAACCC 
AATGACCCAGGAGTCTACACTCAAGTGTGCAAGTTCACCAAGTGGATAAATGACACCATGAAAAAGC 
ATCGCTAACGCCACACTGAGTTAATTAACTGTGTGCTTCCAACAGAAAATGCACAGGA 




ORF Start: ATG at 54 | ORF Stop: TAA at 810 





SEQ ID NO: 94 |252 aa |MW at 27557.6kD 


NOV18b, 
CG145754-03 
Protein Sequence 


MGRPRPRAAKTWMFLLLLGGAWAARGDK 1 1 DG APC ARGSH PWQVALLSGNQLHCGGVLVNERWVLTA 
AHCKMNEYTVHLGSDTLGDRRAQR IKASK S FRHPG YSTQTHVNDLMLVKLNSQARL S SMVKKVRL PS 
RCEPPGTTC WSGWGTTTSPDWFPSDLMCVDVKLIS PQDCTKVYKDLLENSMLCAGI PDSKKNACN 
GDSGGPLVCRGTLQGLVSWGTFPCGQPNDPGVYTQVCKFTKWINDTMKKHR 





SEQ ID NO: 95 |804 bp 


NOV18c, 
CG145754-02 
DNA Sequence 


GG ATTTCCGGGCTCC ATGGC AAGATCCCTTCTCCTGCCCCTGC AGATCCTAC TGCTATCCTTAGCC T 
TGGAAACTGCAGGAGAAGAAGCCCAGGGTGACAAGATTATTGATGGCGCCCCATGTGCAAGAGGCTC 
CCACCCATGGCAGGTGGCCCTGCTCAGTGGCAATCAGCTCCACTGCGGAGGCGTCCTGGTCAATGAG 
CGCTGGGTGCTCACTGCCGCCCACTGCAAGATGAATGAGTACACCGTGCACCTGGGCAGTGATACGC 
TGGGCGACAGGAGAGCTCAGAGGATCAAGGCCTCGAAGTCATTCCGCCACCCCGGCTACTCCACACA 
GACCC ATGTTAATGACCTC AAGC TC ATCTC CCCCCAGGACTGC ACGAAGGTTTACAAGGACTTACTG 
GAAAATTCC ATGCTGTGCGCTGGC ATCCC CGACTCCAAGAAAAACGCC TGC AATGGTGACTC AGGGG 
GACCGTTGGTGTGCAGAGGTACCCTGCAAGGTCTGGTGTCCTGGGGAACTTTCCCTTGCGGCCAACC 
CAATGACCCAGGAGTCTACACTCAAGTGTGCAAGTTCACCAAGTGGATAAATGACACCATGAAAAAG 
CATCGCTAACGCCACACTGAGTTAATTAACTGTGTGCTTCCAACAGAAAATGCACAGGAGTGAGGAC 


GCCGATGACCTATGAAGTC AAATTTGACTTTACCTTTCC TCAAAGATATATTT AAACCTCATGCCC T 


GTTG ATAAAC C AATCAAAT TGGT AAAG AC C TAAAACC AAAACAAATAAAG AAAC AC AAAAC C C TC AA 




ORF Start: ATG at 16 | ORF Stop: TAA at 610 





SEQ ID NO: 96 |198 aa |MW at 21613.6kD 


NOV18c, 
CG145754-02 
Protein Sequence 


MARSLLLPLQILLLSLALETAGEEAQGDKIIDGAPCARGSHPWQVALLSGNQLHCGGVLVNERWVLT 

AAHCKMNEYTVHLGSDTLGDRRAQRIKASKSFRHPGYSTQTHVND^ 

CAGIPDSKKNACNGDSGGPLVCRGTLQGLVSWGTFPCGQPND 
SEQ ID NO: 97 |544 bp 


NOV18d, 
252718128 DNA 
Sequence 


CACCGGATCCGAAGAAGCCC^GGGTGACAAGATTATTGATGGCGCCCCATGTGCAAGAGGCTCCCAC 
CCATGGCAGGTGGCCCTGCTCAGTGGCAATCAGCTCCACTGCGGAGGCGTCCTGGTCAATGAGCGCT 
GGGTGCTCACTGCCGCCCACTGCAAGATGAATGAGTACACCGTGCACCTGGGCAGTGATACGCTGGG 
CGACAGGAGAGCTCAGAGGATCAAGGCCTCGAAGTCATTCCGCCACCCCGGCTACTCCACACAGACC 
CATGTTAATGACCTCAAGCTCATCTCCCCCCAGGACTGCACGAAGGTTTACAAGGACTTACTGGAAA 
ATTCC ATGCTGTGCGCTGGCATCCCCGACTCCAAGAAAAACGC C TGC AATGGTGACTCAGGGGGACC 
GTTGGTGTGCAGAGGTACCCTGCAAGGTCTGGTGTCCTGGGGAACTTTCCCTTGCGGCCAACCCAAT 
GACCCAGGAGTCTACACTCAAGTGTGCAAGTTCACCAAGTGGATAAATGACACCATGAAAAAGCATC 
TCGAGGGC 




ORF Start: at 2 | ORF Stop: end of sequence 





SEQ ID NO: 98 |181 aa |MW at 19683.2kD 


NOV18d, 
252718128 
Protein Sequence 


TGS EEAQGDK 1 1 DGAPC ARG SHPWQVALL SGNQLHCGGVLVNERWVLTAAHCKMNEYTVHLGSDTLG 
DRRAQRIKASKSFRHPGYSTQTHV1TOLKLI SPQDCTKVYKDLLENSMIjCAG IPDSKKNACNGDSGGP 
LVCRGTLQGLVSWGTFPCGQPNDPGWTQVCKFTKWINDTMKKHLEG 





SEQ ID NO: 99 |292bp | 


NOV18e, 
252718152 DNA 
Sequence 


CACCGGATCCGAAGAAGCCCAGGGTGACAAGATTATTGATGGCGCCCCATGTGCAAGAGGCTCCCAC 
CCATGGCAGGTGGCCCTGCTCAGTGGCAATCAGCTCCACTGCGGAGGCGTCCTGGTCAATGAGCGCT 
GGGTGCTCACTGK3CGCCCACTGCAAGATGAATGAGTACACCGTGCACCTGGGCAGTGATACGCTGGG 
CG AC AGG AGAGC TCAG AGGATC AAGGC CTCG AAGTC AT TCCGC C AC CC CGGCTACTCCACACAGACC 
CATGTTAATGACCTCCTCGAGGGC 




ORF Start: at 2 |ORF Stop: end of sequence 





SEQ ID NO: 100 |97 aa |MW at 10551.7kD 


NOV18e, 
252718152 
Protein Sequence 


TGSEEAQGDKIIDGAPCARGSHPWQVALLSGNQLHCGG 
DRRAQRIKASKSPRHPGYSTQTHVNDLLEG 





SEQ ID NO: 101 |742bp | 


NOV18f, 
247856668 DNA 
Sequence 


AGGCTCCGCGGCCGCCCCCTTCACCGGATCCGCCAGGGGTCACAAGATTATTGATGGCGCCCCATGT 
GCAAGAGGCTCCCACCCATGGCAGGTGGCCCTGCTCAGTGGC^ATCAGCTCCACTGCGGAGGCGTCC 
TGGTCAATGAGCGCTGGGTGCTCACTGCCGCCCACTGCAAGATGAATGAGTACACCGTGCACCTGOT 
C AGTGATACGCTGGGCGAC AGGAG AGCTGAG AGGATCAAGGCC TCG AAGTC AT TCCGCCACCC CGGC 
TACTCCACACAGACCCATGTTAATGACCTC^TGCTCGTGAAGCT<^ATAGCCAGGCCAGGCTGTCAT 
CCATGGTGAAGAAAGTCAGGCTGCCCTCCCGCTGCGAACCCCCTGGAACCACCTGTACTGTCTCCGG 
CTGGGGCACTACCACGAGCCCAGATGTGACCTTTCCCTCTGACCTCATGTGCGTGGATGTCAAGCTC 
ATCTCCCCCCACMACTCCACGAAGGTTTACAACrcAC 

TCCCCGACTCCAAGAAAAACGCCTGCAATGGTGACTCAGGGGGACCGTTGGTGTGCAGAGGTACCCT 
GCAAGGTCTGGTGTCCTGGGGAACTTTCCCTTGCGGCCAACCCAATGACCCAGGAGTCTACACTCAA 
GTGTGCAAGTTCACCAAGTGGATAAATGACACCATGAAAAAGCATCGCCTCGAGGGCAAGGGTGGGC 
GCGCC 




ORF Start: at 2 | ORF Stop: end of sequence 





SEQ ID NO: 102 |247 aa |MW at 26591.2kD 


NOV18f, 
247856668 
Protein Sequence 


GSAAAPFTGSARGDKIIIXSAPCARGSHPWQVALLSGNQLHCGGVLVNERWVLTAAHCKMNE^ 
SDTLGDRRAQRIKASKSFIUIPGYSTQTHVNDLI^VKLNSQARLSSMVKKVRLPSRCEPPGTTCWS^ 
WGTTT S PDVTF PSDLMCVDVKL I S PQDCTKVYKDLLENSMLC AGI PDSKKNACNGDSGGPLVCRGTL 
QGLVSWGTFPCGQPNDPGVYTQVCKFTKWINDTMKKHRLEGKGGRA 





SEQ ID NO: 103 |673 bp 


NOV18g, 
247856705 DNA 
Sequence 


AGGCTCCGCGGCCGCCCCCTTCACCGGATCCGCCAGGGGTGACAAGATTATTGATGGCGCCCCATGT 
GCAAGAGGCTCCCACCCATGGCAGGTGGCCCTGCTCAGTGGCAATCAGCTCCACTGCGGAGGCGTCC 
TGGTCAATGAGCGCTGGGTGCTCACTGCCGCCCACTGCAAGATGAATGAGTACACCGTGCACCTGGG 
CAGTGATACGCTGGGCGACAGGAGAGCTCAGAGGATCAAGGCCTCGAAGTCATTCCGCCACCCCGGC 
TACTCCACACAGACCCATGTTAATGACCTCATGCTCGTGAAGCTCAATAGCCAGGCCAGGCTGTCAT 
CCATGGTGAAGAAAGTCAGGCTGCCCTCCCGCTGCGAACCCCCTGGAACCACCTGTACTGTCTCCGG 
CTGGGGCACTACCACGAGCCC AGATGTGACC TTTCCCTCTGAC CTCATGTGCGTGGATGTC AAGCTC 
ATCTCCCCCCAGGACTGCACGAAGGTTTACAAGGACTTACTGGAAAATTCCATGCTGTGCGCTGGCA 
TCCCCGACTCCAAGAAAAACGCCTGCAATGGTGACTCAGGGGGACCGTTGGTGTGCAGAGGTACCCT 
GCAAGGTCTGGTGTCCTGGGGAACTTTCCCTTGCGGCCAACCCAATCTCGAGGGCAAGGGTGGGCGC 
GCC 




ORF Start: at 2 | ORF Stop: end of sequence 





SEQ ID NO: 104 |224 aa |MW at 23813.0kD 


NOV18g, 
247856705 
Protein Sequence 


GSAAAPFTGSARGDK I IDGAPCARGSHPWQVALL SGNQLHCGGVLVNERWVLTAAHCKMNEYTVHLG 
SDTLGDRRAQRIKASKSFRHPGYSTQTHVNDLMLVKLNSQAR^ 

WGTTTS PDVTFPSDLMCVDVKL I S PQDCTKVYKDLLENSMLCAGI PDSKKNACNGDSGG PLVCRGTL 
QGLVSWGTFPCGQPNLEGKGGRA 



Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 18B. 



Table 18B. Comparison of NOV18a against NOV18b through NOV18g. 


Protein Sequence 


NOV18a Residues/ 
Match Residues 


Identities/ 

Similarities for the Matched Region 


NOV18b 


25..250 
27..252 


213/226 (94%) 
213/226(94%) 
NOV18c 


16 


jgjr -r ZB 11 Sl0rev-3!JL37: 

I /O/ZJD ^ l*V70) 




19..198 


177/235 (74%) 


NOV18d 


17 240 


1 IZIZ55 (lifo) 




1..178 


173/233 (73%) 


NOV18e 


17..111 


92/95 (96%) 




1..95 


93/95 (97%) 


NOV18f 


22..250 


215/229 (93%) 




11..239 


216/229 (93%) 


NOV18g 


22..230 


193/209 (92%) 




11..219 


194/209 (92%) 



Further analysis of the NOV18a protein yielded the following properties shown in 
Table 18C. 



Table 18C. Protein Sequence Properties NOV18a 


PSort analysis: 


0.6233 probability located in outside; 0.1000 probability located in 
endoplasmic reticulum (membrane); 0.1000 probability located in 
endoplasmic reticulum (lumen); 0.1000 probability located in microbody 
(peroxisome) 


SignalP analysis: 


Cleavage site between residues 20 and 21 



A search of the NOV18a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 18D. 



Table 18D. Geneseq Results for NOV18a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV18a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Region 


Expect 
Value 


AAU82740 


Amino acid sequence of 
novel human protease #39 - 
Homo sapiens, 253 aa. 
[WO200200860-A2, 
03-JAN-2002] 


1..250 
4..253 


250/250(100%) 
250/250 (100%) 


e-150 
AAW05383 


Human amyloid precursor 
protein protease - Homo 
sapiens, 253 aa. 
[WO9631122-A1, 
10-OCT-1996] 


1..250 
4..253 


250/250 (100%) 




AAR67888 


Human stratum corneum 
chymotrophic recombinant 
enzyme (SCCE) - Homo 
sapiens, 253 aa. 
[WO9500651-A, 
05-JAN-1995] 


1..250 
4..253 


250/250 (100%) 
250/250 (100%) 


e-150 


AAB21326 


Human HSCEE - Homo 
sapiens, 257 aa. 
[WO200053776-A2, 
14-SEP-2000] 


1..250 
4..257 


249/255 (97%) 
249/255 (97%) 


e-146 


AAB98502 


Human Stratum Corneum 
Chymotryptic Enzyme, 
SCCE, catalytic domain - 
Homo sapiens, 225 aa. 
[WO200129056-A1, 
26-APR-2001] 


26..250 
1..225 


225/225 (100%) 
225/225 (100%) 


e-136 



In a BLAST search of public sequence databases, the NOV18a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 18E. 
Table 18E. Public BLASTP Results for NOV18a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV18a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Portion 


Expect 
Value 


P49862 


Kallikrein 7 precursor (EC 
3.4.21.-) (Stratum corneum 
chymotryptic enzyme) 
(hSCCE) - Homo sapiens 
(Human), 253 aa. 


1..250 
4..253 


250/250(100%) 
250/250(100%) 


e-149 


AAH32005 


Kallikrein 7 (chymotryptic, 
stratum corneum) - Homo 
sapiens (Human), 253 aa. 


1..250 
4..253 


249/250(99%) 
249/250 (99%) 


e-148 


Q91VE3 


Thymopsin (Stratum 
corneum chymotryptic 
enzyme) - Mus musculus 
(Mouse), 249 aa. 


3..250 
5..249 


185/248 (74%) 
212/248 (84%) 


e-111 
AAN03663 


Kallikrein 7 short variant 
protein - Homo sapiens 
(Human), 181 aa. 


70..250 
1..181 


181/181 (100%) 


effflf 3a ' ' 


Q9R048 


Stratum corneum 
chymotryptic enzyme - Mus 
musculus (Mouse), 234 aa 
(fragment). 


3..235 
5..234 


175/233 (75%) 
198/233 (84%) 


e-102 



PFam analysis predicts that the NOV18a protein contains the domains shown in 
Table 18F. 



Table 18F. Domain Analysis of NOV18a 


Pfam Domain 


NOV18a Match Region 


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


trypsin 


27..242 


93/262(35%) 
182/262(69%) 


3.8e-87 



Example 19. 

The NOV19 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 19A. 



Table 19A. NOV19 Sequence Analysis 




SEQ ID NO: 105 |2028bp 


NOV19a, 
CG146279-01 
DNA Sequence 


TTGAGGACTTTATTATTATTTGGGTTCTTTTC^ 


TTCCAATCGAGACGCCAAGAAAACAGGTGAACTGGGATCCTAAAGTGGCCGTTCCCGCAGCAGCACC 
GGTGTGCCAGCCCAAGAGCGCCACTAACGGGCAACCCCCGGCTCCGGCTCCGACTCCAACTCCGCGC 
CTGTCCATTTCCTCCCGAGCCACAGTGGTAGCCAGGATGGAAGGCACCTCCCAAGGGGGCTTGCAGA 
CCGTCATGAAGTGGAAGACGGTGGTTGCCATCTTTGTGGTTGTGGTGGTCTACCTTGTCACTGGC^ 
TCTTGTCTTCCGGGCATTGGAGCAGCCCTTTGAGAGCAGCCAGAAGAATACCATCGCCTTGGAGAAG 
GCGGAATTCCTGCGGGATCATGTCTGTGTGAGCCCCCAGGAGCTGGAGACGTTGATCCAGCATGCTC 
TTG ATGC TG AC AATGCG GGAGTC AGTCC AATAGGAAAC TC T TCCAAC AAC AGCAGCCACTGGGAC C T 
CGGCAGTGCCTTTTTCTTTGCTGGAACTGTCATTACGACCATAGGGTATGGGAATATTGCTCCGAGC 
ACTGAAGGAGGCAAAATCTTTTGTATTTTATATGCCATCTTTGGAATTCCACTCTTTGGTTTCTTAT 
TGGCTGGAATTGGAGACCAACTTGGAACCATCTTTGGGAAAAGCATTGCAAGAGTGGAGAAGGTCTT 
TCGAAAAAAGCAAGTGAGTCZAGACCAAGATCCGGGTCATCTCAACCATCCTGTTCATCTTGGCCGGC 
TGCATTGTGTTTGTGACGATCCCTGCTGTCATCTTTAAGTACATCGAGGGCTGGACGGCCTTGGAGT 
CCATTTACTTTGTGGTGGTCACTCTGACCACGGTGGGCTTTGGTGATTTTGTGGCAGGGGGAAACGC 
TGGCATCAATTATCGGGAGTGGTATAAGCCCCTAGTGTGGT^ 

TTTGCAGCTGTCCTCAGTATGATCGGAGATTGGCTACGGGTTCTGTCCAAAAAGACAAAAGAAGAGG 

TGGGTGAAATCAAGGCCCATGCGGCAGAGTGGAAGGCCAATGTCACGGCTGAGTTCCGGG 

GCGAAGGCTCAGCGTGGAGATCCACGATAAGCTGCAGCGGGCGGCCACCATCCGCAGCATGGAGCGC 

CGGCGGCTGGGCCTGGACCAGCGGGCCCACTCACTGGACATGCTGTCCCCCGAGAAGCGCTCTGTCT 

TTGCTGCCCTGGACACCGGCCGCTTCAAGGCCTCATCCCAGGAGAGCATCAACAACCGGCCCAACAA 

CC TGCG C C TGAAGGGGCCGGAG C AGCTGAACAAGCATGGGCAGGGTGCGTC CGAGGACAACATC ATC 

AACAAGTTCGGGTCCACCTCCAGACTCACCAAGAGGAAAAACAAGGACCTCAAAAAGACCTTGCCCG 

AGGACGTTCAGAAAATCTACAAGACCTTCCGGAATTACTCCCTGGACGAGGAGAAGAAAGAGGAGGA 

GACGGAAAAGATGTGTAACTCAGACAACTCCAGCACAGCCATGCTGACGGACTGTATCCAGCAGCAC 
GCTGAGTTGGAGAACGGAATGATACC^^ 

TTGAAGACAGAAACTAAATGTGAAGGACATTGGTCTTGGACTGAGCGTTGTGTGTGTGTGTGTGTGT 


GTTTTTAATATTCACACTGAGACATGTGCCTTAAACAGACTTTTTAGTCCAAAATTACATAGCATTG 


AAGAATATATTTCACTGTGCCATAAACAACTGAAAGCTTGCTCTGCCAAAAGGAATCAGAGAACAAG 


AACTTCATTTCAGATAGCAAACGCAGGACACACCAAGAGTGTCCGTGCACGTAGCCGGTTCTGGCCG 


TACATGTTAAGGGCATTTC AGTGGCAGTGCTGTAC CCCTGGGCAGTGC TACCTGGGCACACACGTAG 


ACAAGGGCAGCTATTCCT 




ORF Start: ATG at 61 | ORF Stop: TAA at 1690 





SEQ ID NO: 106 |543 aa |MW at 60334.6kD 


NOV19a, 
CG146279-01 
Protein Sequence 


MKF PI ETPRKQVNWDPKVAVP AAAPVC QPKS ATNGQPP APAPTPTPRLS I SSRATWARMEGTSQGG 
LQTVMKWKTWA I FVVVVVYL VTGGLVFRALEQPFES SQKNT I ALEKAEFLRDHVC VSPQELETL I Q 
HALDADWAGVSPIGNSSNNSSHWDLGSAFFFAGTVITTIGYGNIAPSTEGGKIFCILYAIFGIPLFG 
FLLAG I GDQLGT I FGKS I ARVEKVFRKKQVSQTK IRVI ST ILF I LAGC I VFVTI PAVI FKYI EGWTA 
LESIYFVVVTLTTVGFGDFVAGGNAGINYREVTOKPLVWWIL^ 
EEVGEIKAHAAFAfKANVTAEFRETRRRLSVEIHDKLQR^^ 

SVFAALDTGRFKASSQES INNRPNNLRLKGPEQLNKHGQGASEDNI INKFGSTSRLTKRKNKDLKKT 
LPEDVQK I YKTFRNYS LDEEKKEEETEKMCNSDNS STAMLTDC I QQHAELENGMI PTDTKDREPENN 
SLLEDRN 



Further analysis of the NOV19a protein yielded the following properties shown in 
Table 19B. 
Table 19B. Protein Sequence Properties NOV19a 


PSort analysis: 


0.6000 probability located in plasma membrane; 0.4000 probability located in 
Golgi body; 0.3000 probability located in endoplasmic reticulum (membrane); 
0.3000 probability located in microbody (peroxisome) 


SignalP analysis: 


No Known Signal Sequence Predicted 



A search of the NOV19a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 19C. 



Table 19C. Geneseq Results for NOV19a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV19a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Region 


Expect 
Value 
AAU81354 


Novel human ion channel 
protein #34 - Homo sapiens, 
543 aa. [WO200185788-A2, 
15-NOV-2001] 


1..543 
1..543 


543/543 (100%) 
543/543 (100%) 




AAU79472 


Human novel transporter 
protein - Homo sapiens, 543 
aa. [WO200224748-A2, 
28-MAR-2002] 


1..543 
1..543 


543/543 (100%) 
543/543 (100%) 


0.0 


AAU79473 


Human novel transporter 
protein variant - Homo 
sapiens, 543 aa. 
[WO200224748-A2, 
28-MAR-2002] 


1..543 
1..543 


542/543 (99%) 
543/543 (99%) 


0.0 


AAE16596 


Human TWIK-Related K+ 
channel-2 (TREK-2) protein 
- Homo sapiens, 538 aa. 
[WO200200715-A2, 
03-JAN-2002] 


18..543 
13..538 


526/526(100%) 
526/526 (100%) 


0.0 


AAB47930 


Human TREK2 - Homo 
sapiens, 538 aa. 
[WO200200715-A2, 
03-JAN-2002] 


18..543 
13..538 


526/526 (100%) 
526/526 (100%) 


0.0 


In a BLAST search of public sequence databases, the NOV19a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 19D. 


Table 19D. Public BLASTP Results for NOV19a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV19a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Portion 


Expect 
Value 


Q8TDK7 


Potassium channel TREK2 
splice variant b - Homo 
sapiens (Human), 543 aa. 


1..543 
1..543 


542/543 (99%) 
542/543 (99%) 


0.0 


P57789 


Potassium channel subfamily 
K member 10 (Outward 
rectifying potassium channel 
protein TREK-2) (TREK-2 
K+ channel subunit) - Homo 
sapiens (Human), 538 aa. 


18..543 
13..538 


526/526(100%) 
526/526 (100%) 


0.0 


Q8TDK8 


Potassium channel TREK2 
splice variant a - Homo 
sapiens (Human), 543 aa. 


18..543 
18..543 


525/526 (99%) 
525/526(99%) 


0.0 
Q9JIS4 


Potassium channel subfamily 
K member 10 (Outward 
rectifying potassium channel 
protein TREK-2) (TREK-2 
K+ channel subunit) - Rattus 
norvegicus (Rat), 538 aa. 


1..543 
1..538 


520/544 (95%) 
529/544 (96%) 


0.0 


P97438 


Potassium channel subfamily 
K member 2 (Outward 
rectifying potassium channel 
protein TREK-1) (Two-pore 
potassium channel TPKC1) 
(TREK-1 K+ channel 
subunit) - Mus musculus 
(Mouse), 411 aa. 


22..404 
2..369 


247/384(64%) 
301/384(78%) 


e-136 



PFam analysis predicts that the NOV19a protein contains the domains shown in 
Table 19E. 
Table 19E. Domain Analysis of NOV19a 


Pfam Domain 


NOV19a Match Region 


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


ion_trans 


158..323 


41/231 (18%) 
119/231 (52%) 


0.046 



Example 20. 

The NOV20 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 20A. 



Table 20A. NOV20 Sequence Analysis 




SEQ ID NO: 107 |2958 bp 


NOV20a, 
CG146374-01 
DNA Sequence 


GCTCCTCCCCGCTGGCGGGGGGAGAAAGGGCAGGAGGCCTTCCGTCCCGGCTATAAAGGGCCCCGGA 


CCGCCGCGGCTCGCCTCGGCTTGCCTCGACACGCCTAGGCGCCCTCCGGCTCCGCCCTAGCCGCCGC 


GTCCCAGCTAGAGCTCCAGCGCCCGCTCAGGCCCCACTCGACCCTCTCGGGCCTCGGCTACTTGGAC 


TGCGGCGGAATATGGCGGCTCCGATGACTCCCGCGGCTCGGCCCGAGGACTACGAGGCGGCGCTCAA 
TGCCGCCCTGGCTGACGTGCCCGAACTGGCCAGACTCCTGGAGATCGACCCGTACTTGAAGCCCTAC 
GCC GTGG AC TT C C AGCGC AGGTAT AAGC AG TTT AGCC AAATTTTGAAGAACATTGG AGAAAATGAAG 
GTGGTATTGATAAGTTTTCCAGAGGCTATGAATCATTTGGCGTCCACAGATGTGCTGATGGTGGTTT 
ATACTGCAAAGAATGGGCCCCGGGAGCAGAAGGAGTTTTTCTTACTGGAGATTTTAATGGTTGGAAT 
CCAO^TTCGTACCCATACAAAAAACTGGATTATGGAAAATXX^AGCTGTATATCCCACCAAAGCAGA 
ATAAATCTGTACTCGTGCCTCATGGATCCAAATTAAAGGTAGTTATTACTAGTAAAAGCGGAGAGAT 
CTTGTATCGTATTTCACCGTGGGCAAAGTATGTGGTTCGTGAAGGTGATAATGTGAATTATGATTGG 
ATACAC TGGGATCC AGAACACTCATATGAGTTTAAGC ATTCCAGACC AAAGAAGCCACGGAGTC TAA 
GAATTTATGAATCTCATGTGGGAATTTCTTCCCATGAAGGAAAAGTAGCTTCTTATAAACATTTTAC 
ATGC AATGTAC TACC AAGAATCAAAGGCC TTGGATAC AACTGCATTC AGTTGATGGC AATC ATGGAG 
CATGCTTACTATGCC 

CACCTGAAGAGCTACAAGAACTGGTAGACACAGCTCATTCCATGGGTATCATAGTCCTCTTAGATGT 
GGTACACAGCCATGCTTCAAAAAATTCAGCAGATGGATTGAATATGTTTGATGGGACAGATTCCTGT 
TATTTTCATTCTGGACCTAGAGGGACTCATGATCTTTGGGATAGCAGATTGTTTGCCTACTCCAGGT 
TGAATATTTCAGACATCTA AGCCAATTAGAATCATGATTGTTTTGATTGCCAGAAATCCTTAAATCT 
GGGAAGTTTTAAGATTCCTTCTGTCAAACATAAGATGGTGGTTGGAAGAATATCGCTTTGATGGATT 
TCGTTTTGATGGTGTTACGTCCATGCTTTATCATCACCATGGAGTGGGTCAAGGTTTCTCAGGTGAT 
TACAGTGAATATTTCGGACTACAAGTAGATGAAGATGCCTTGACTTACCTCATGTTGGCAAATCATT 
TGGTTCACACGCTGTGTCCCGATTCTATAACAATAGCTGAGGATGTATCAGGAATGCCAGCTCTGTG 
CTCTCCAATTTCCCAGGGAGGGGGTGGTTTTGACTATCGACTAGCCATGGCAATTCCAGATAAGTGG 
ATTCAGCTACTTAAAGAGTTTAAAGATGAAGACTGGAACATGGGCGATATAGTATACACGCTCACAA 
ACAGGCGCTACCTTGAAAAGTGCATTGCTTATGCAGAGAGCCATGATCAGGCATTGGTTGGGGATAA 
GTCGCTGGCATTTTGGTTGATGGATGCCGAT^ ATGTATACAAACATGAGTGTCCTGACTCCTTTTACT 
CC AGTTATTGATCGTGGAAT ACAGC TTC ATAAAATG ATTCGAC TC ATTACGC ATGGG C TTGGTGG AG 
AAGGCTATCTC AATTTCATGGGTAATGAATTTGGGC ATCCTGAATGGTTAGACTTC C C AAG AAAAGG 
AAATAATG AGAGTT ACC ATT ATGCC AGGCGGC AGTTTC ATTT AAC TGACGACGACC TTCT TCGCTAC 
AAGTTCCTAAATAATTTTGACAGGGATATGAATAGATTGGAAGAAAGATATGGTTGGCTTGCAGCTC 
CACAGGCCTACGTGAGTGAAAAACATGAAGGCAATAAGATCATTGCTTTTGAAAGAGCAGGTCTTCT 
TTTCATTTTCAACTTCCATCCAAGCAAGAGCTACACTGACTACCGAGTTGGAACAGCATTGCCAGGG 
AAATTCAAAATTGTGCTAGATTCAGATGCAGCGGAATATGGAGGGCATCAGAGACTGGACCACAGCA 
CTGACTTTTTTTCTGAGGCTTTTGAACATAATGGGCGTCCCTATTCTCTTTTGGTGTACATTCCAAG 
CAGAGTGGCCCTCATCCTTCAGAATGTGGATCTGCCGAATTGAAGAGGCCTGATTTCAGCTCCACCA 
GATGCAGATTTGTGTTTTGTTTTCTTGTTATCACTGTCACACAGCTTATAACATGTATGCTTTTCAG 
AAT AC AG T TGTC T AGC C AAGCC ATC AAG TG T C TG AAATT C AAT AT TGG T T TATGCAAAT ACAGC AAA 
CTTTTATTTAAGTAGATAGGAGAATATGTTTAAAATATTAGGAATCCTAGACCATATTTTCAAGTCA 
TCTTAGCAGCTAGGATTCTCAAATGGAAGTGTTATATATAATATGTTAAAAACATTTTGCTTTCCTG 
GCTAATTATTTGATCCTTTTAAATTCAAATTTGAATCATTTGTCATGTATGATTATTTCTGTTAAAT 
GTACACAGTATTTAAGATGGATATTTGGTGGCTCTATTTGTTCTGATATCTTTTGGTCTAAATTATG 
AGGTACCAAGATTGTTTCTTTGTTTCTTTTTTTCAAATTGTGTTTAGAAATACTGTAATAAATATGC 
AGTAGTGATATAAAGAATTATATCCAAGGTAATATAAAAGCCATTACGTATGAACTCAAAj^AAAAAA 
AAAAAAAAAA 

ORF Start: ATG at 213 | ORF Stop: TAA at 1224 





SEQ ID NO: 108 | 337 aa | MW at 38247.8kD 


NOV20a, 
CG146374-01 
Protein Sequence 


MAAPMTPAARPEDYEAALNAALADV^ 

KFSRGYESFGVHRCADGGLYCKEWAPGAEGVFLTGDFNGWNPFS Y P YKKLDYGKWEI/YI P PKQNKSV 
LVPHGSKLKWITSKSGEILYRISPWAKYVTOEGDlWNyDWIHWDPEHSYEFKHSRPKKPRSLRIYE 
SHVGISSHEGKVASYKHFTCNVLPRIKGLGYNCIQLMAIMEHAYYASFGYQITSFFAASSRYGSPEE 
LQELVDTAHSMGIIVLLDVVHSHASKNSADGLNMFDGTDSCYFHSGPRGTHDLWDSRLFAYSRLNIS 
DI 
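
The ORF coordinates and the protein length given in Table 20A can be checked against each other with simple arithmetic. The following is a minimal sketch, assuming that both positions are 1-based and refer to the first base of the start codon and of the stop codon; the function name `orf_length_aa` is introduced here for illustration only.

```python
def orf_length_aa(start: int, stop: int) -> int:
    """Return the number of residues encoded by an ORF.

    start: 1-based position of the first base of the ATG start codon.
    stop:  1-based position of the first base of the stop codon.
    """
    span = stop - start
    if span <= 0 or span % 3 != 0:
        raise ValueError("start and stop codons are not in the same reading frame")
    return span // 3


# Coordinates stated for NOV20a: ATG at 213, TAA at 1224.
print(orf_length_aa(213, 1224))  # 337
```

Under these assumptions the result agrees with the 337 aa stated for SEQ ID NO: 108.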



Further analysis of the NOV20a protein yielded the following properties shown in 
Table 20B. 



Table 20B. Protein Sequence Properties NOV20a 


PSort analysis: 


0.7480 probability located in microbody (peroxisome); 0.6000 probability 
located in nucleus; 0.1000 probability located in mitochondrial matrix space; 
0.1000 probability located in lysosome (lumen) 


SignalP analysis: 


No Known Signal Sequence Predicted 



A search of the NOV20a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 20C. 



Table 20C. Geneseq Results for NOV20a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV20a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


AAB90803 


Human shear stress-response 
protein SEQ ID NO: 106 - 
Homo sapiens, 702 aa. 
[WO200125427-A1, 
12-APR-2001] 


1..330 

1..330 


328/330(99%) 
329/330(99%) 


0.0 


ABB60350 


Drosophila melanogaster 
polypeptide SEQ ID NO 
7842 -Drosophila 
melanogaster, 865 aa. 
[WO200171042-A2, 
27-SEP-2001] 


22..329 
1..314 


170/314 (54%) 
227/314(72%) 


e-102 


AAB49603 


Glycogen branching enzyme 
amino acid sequence - 
Aspergillus nidulans, 686 aa. 
[JP2000279180-A, 
10-OCT-2000] 


31..329 
12..314 


175/305 (57%) 
228/305 (74%) 


1e-98 


AAG39093 


Arabidopsis thaliana protein 
fragment SEQ ID NO: 48322 
- Arabidopsis thaliana, 721 
aa. [EP1033405-A2, 
06-SEP-2000] 


30..329 
22..321 


161/302 (53%) 
214/302 (70%) 


3e-92 


AAG39092 


Arabidopsis thaliana protein 
fragment SEQ ID NO: 48321 
- Arabidopsis thaliana, 858 
aa. [EP1033405-A2, 
06-SEP-2000] 


30..329 
159..458 


161/302 (53%) 
214/302 (70%) 


3e-92 
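
The identity percentages in Table 20C can be recomputed from the fractions printed beside them. The sketch below is illustrative: the helper name `identity_percent` is an assumption, and the use of integer truncation (rather than rounding) is inferred from the identity values in these tables, which it reproduces. The similarity column is not reproduced by this arithmetic in every row of this document, so the sketch is applied to identities only.

```python
def identity_percent(fraction: str) -> int:
    """Convert an 'identical/aligned' fraction such as '175/305' to a whole percent.

    The percentage is truncated, not rounded, which matches the identity
    values printed in the table (for example 175/305 is shown as 57%).
    """
    identical, aligned = (int(part) for part in fraction.split("/"))
    if aligned <= 0 or identical > aligned:
        raise ValueError(f"implausible fraction: {fraction}")
    return identical * 100 // aligned


for row in ("328/330", "170/314", "175/305", "161/302"):
    print(row, identity_percent(row))
```

A check of this kind is also a quick way to spot a transcription error, since a fraction whose recomputed percentage differs widely from the printed one cannot be correct as written.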



In a BLAST search of public sequence databases, the NOV20a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 20D. 



Table 20D. Public BLASTP Results for NOV20a 


Protein 
Accession 
Number 


Protein/Organism/Length 


NOV20a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Portion 


Expect 
Value 

Q96EN0 


Similar to glucan 
(1,4-alpha-), branching 
enzyme 1 (glycogen 
branching enzyme, Andersen 
disease, glycogen storage 
disease type IV) - Homo 
sapiens (Human), 707 aa. 


1..330 
1..330 


330/330(100%) 
330/330 (100%) 


0.0 


Q04446 


1,4-alpha-glucan branching 
enzyme (EC 2.4.1.18) 
(Glycogen branching 
enzyme) (Brancher enzyme) 
- Homo sapiens (Human), 
702 aa. 


1..330 
1..330 


328/330 (99%) 
329/330 (99%) 


0.0 


Q9D6Y9 


2310045H19Rik protein 
(RIKEN cDNA 2310045H19 
gene) - Mus musculus 
(Mouse), 702 aa. 


1..330 


701A^fl fftR<3f>"\ 
£*y ii jju \oovoj 

310/330(93%) 


17Q 


AAF58416 


CG4023-PA - Drosophila 
melanogaster (Fruit fly), 685 
aa. 


22..329 
1..314 


170/314(54%) 
227/314 (72%) 


e-102 


Q9V6K7 


CG4023 protein - Drosophila 
melanogaster (Fruit fly), 865 
aa. 


22..329 
1..314 


170/314 (54%) 
227/314(72%) 


e-102 
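
The Expect column mixes several notations: `0.0`, ordinary scientific notation such as `3e-92`, and the BLAST shorthand `e-102`, which stands for 1e-102. A minimal sketch of a parser that normalises these forms is given below; the function name `parse_expect` is an assumption introduced for illustration.

```python
def parse_expect(text: str) -> float:
    """Parse a BLAST Expect value as printed in the tables.

    BLAST report output may drop the leading mantissa, so 'e-102'
    is read as 1e-102. Other values are parsed as ordinary floats.
    """
    text = text.strip()
    if text.startswith("e"):
        text = "1" + text
    return float(text)


values = [parse_expect(v) for v in ("0.0", "e-102", "3e-92")]
# Smaller Expect values indicate more significant matches.
print(sorted(values))
```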



PFam analysis predicts that the NOV20a protein contains the domains shown in the 
Table 20E. 




Table 20E. Domain Analysis of NOV20a 


Pfam Domain 


NOV20a Match Region 


Identities/ 
Similarities 
for the Matched 
Region 


Expect Value 


isoamylase_N 


73..168 


31/123 (25%) 
64/123(52%) 


5.1e-11 



Example 21. 

The NOV21 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 21 A. 




Table 21A. NOV21 Sequence Analysis 




SEQ ID NO: 109 | 885 bp 


NOV21a, 
CG146403-01 
DNA Sequence 


TGGATGCTGGCGGTCCTCTACCTGGTCTGGCTCTATTGGGATAGAAACATACCCAGGGCTGGTGGAA 
r*r*n r*> «mr<r>^ i\r>rnnr^^ n* a a nfi a Apr* (^nttri ATTTnfS Af2AP A APTAAfiGnATTATTATCCTGTCAAGCT 
GGTGAAAACAGCAGAGCTGCCCCCGGATCGGAACTACGTGCTGGGCGCCCACCCTCATGGGATCATG 
TGTACAGGCTTCCTCTGTAATTTCTCCACCGAGAGCAATGGCTTCTCCCAGCTCTTCCCGGGGCTCC 
GGCCCTGGTTAGCCGTGCTGGCTGGCCTCTTCTACCTCCCGGTCTATCGCGACTACATCATGTCCTT 
TGGTCTCTGTCCGGTGAGCCGCCAGAGCCTGGACTTCATCCTGTCCCAGCCCCAGCTCGGGCAGGCC 
GTGGTCATCATGGTGGGGGGTGCGCACGAGGCCCTGTATTCAGTCCCCGGGGAGCACTGCCTTACGC 
TCCAGAAGCGCAAAGGCTTCGTGCGCCTGGCGCTGAGGCACGGGGCGTCCCTGGTGCCCGTGTACTC 
CTTTGGGGAGAATGACATCTTTAGACTTAAGGCTTTTGCCACAGGCTCCTGGCAGCATTGGTGCCAG 
CTCACCTTCAAGAAGCTCATGGGCTTCTCTCCTTGCATCTTCTGGGGTCGCGGTCTCTTCTCAGCCA 
CCTCCTGGGGCCTGCTGCCCTTTGCTGTGCCCATCACCACTGTGGGTGAGCCCATCCCCGTCCCCCA 
GCGCCTCCACCCCACCGAGGAGGAAGTCAATCACTATCACGCCCTCTACATGACGGCCCTGGAGCAG 
CTCTTCGAGGAGCACAAGGAAAGCTGTGGGGTCCCCGCTTCCACCTGCCTCACCTTCATCTAGGCCT 
GGCCGCGGCCTTTC 




ORF Start: ATG at 4 | ORF Stop: TAG at 865 








SEQ ID NO: 110 | 287 aa | MW at 32641.7kD 


NOV21a, 
CG146403-01 
Protein Sequence 


MLAVLYLVWLYWDRNIPRAGGRRSEWIRNRAIWRQLRDYYPVKLVKTAELPPDRNYVLGAHPHGIMC 
TGFLCNFSTESNGFSQLFPGLRPWLAVLAGLFYLPVYRDYIMSFGLCPVSRQSLDFILSQPQLGQAV 
VIMVGGAHEALYSVPGEHCLTLQKRKGFVRLALRHGASLVPVYSFGENDIFRLKAFATGSWQHWCQL 
TFKKLMGFSPCIFWGRGLFSATSWGLLPFAVPITTVGEPIPVPQRLHPTEEEVNHYHALYMTALEQL 
FEEHKESCGVPASTCLTFI 
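
The protein listing in Table 21A can be cross-checked against the nucleotide listing by translating the ORF with the standard genetic code. The sketch below is illustrative only: the function name `translate` and the compact codon-table construction are assumptions introduced here, not part of the original analysis. It translates the final 63 nucleotides of the NOV21a ORF as listed in Table 21A and stops at the TAG codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Last 21 codons of the NOV21a ORF, ending with the TAG stop codon at position 865.
ORF_TAIL = "CTCTTCGAGGAGCACAAGGAAAGCTGTGGGGTCCCCGCTTCCACCTGCCTCACCTTCATCTAG"
print(translate(ORF_TAIL))  # LFEEHKESCGVPASTCLTFI
```

The translated residues correspond to the C-terminal end of the NOV21a protein sequence.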



Further analysis of the NOV21a protein yielded the following properties shown in 
Table 21B. 




Table 21B. Protein Sequence Properties NOV21a 


PSort analysis: 


0.5500 probability located in endoplasmic reticulum (membrane); 0.3814 
probability located in lysosome (lumen); 0.3200 probability located in 
microbody (peroxisome); 0.1000 probability located in endoplasmic reticulum 
(lumen) 


SignalP analysis: 


Cleavage site between residues 22 and 23 
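
The PSort entry in Table 21B is a semicolon-separated list of scores and locations, which can be turned into a ranked list. The following sketch is an illustration under stated assumptions: the function name `parse_psort` and the regular expression are introduced here, and the input string is the NOV21a entry joined onto one line. The listed values sum to more than 1, so they are treated as independent scores rather than as a normalised distribution.

```python
import re

PSORT_TEXT = (
    "0.5500 probability located in endoplasmic reticulum (membrane); "
    "0.3814 probability located in lysosome (lumen); "
    "0.3200 probability located in microbody (peroxisome); "
    "0.1000 probability located in endoplasmic reticulum (lumen)"
)


def parse_psort(text: str) -> list[tuple[str, float]]:
    """Return (location, score) pairs sorted from highest to lowest score."""
    pairs = re.findall(r"(\d\.\d+) probability located in ([^;]+)", text)
    ranked = [(location.strip(), float(score)) for score, location in pairs]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


ranked = parse_psort(PSORT_TEXT)
print(ranked[0])  # ('endoplasmic reticulum (membrane)', 0.55)
```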



A search of the NOV21a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 21C. 



Table 21C. Geneseq Results for NOV21a 








Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV21a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Region 


Expect 
Value 


AAM80262 


Human protein SEQ ID NO 
3908 - Homo sapiens, 223 
aa. [WO200157190-A2, 
09-AUG-2001] 


43..237 
29..223 


195/195 (100%) 
195/195 (100%) 


e-115 


ABB75677 


Breast protein-eukaryotic 
conserved gene 1 
(BSTP-ECG1) protein - 
Homo sapiens, 388 aa. 
[WO200208260-A2, 
31-JAN-2002] 


1..284 
101..385 


158/285 (55%) 
218/285 (76%) 


1e-97 


AAB66170 


Protein of the invention #82 - 
Unidentified, 388 aa. 
[WO200078961-A1, 
28-DEC-2000] 


1..284 
101..385 


158/285 (55%) 
218/285 (76%) 


1e-97 


AAU29191 


Human PRO polypeptide 
sequence #168 - Homo 
sapiens, 388 aa. 
[WO200168848-A2, 
20-SEP-2001] 


1..284 


158/285 (55%) 


1e-97 


AAY99421 


Human PRO1433 (UNQ738) 
amino acid sequence SEQ ID 
NO:292 - Homo sapiens, 388 
aa. [WO200012708-A2, 
09-MAR-2000] 


1..284 
101..385 


158/285 (55%) 
218/285 (76%) 


1e-97 



In a BLAST search of public sequence databases, the NOV21a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 21D. 



Table 21D. Public BLASTP Results for NOV21a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV21a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


Q9XJDW7 


WUGSC:H_DJ0747G18.5 
protein - Homo sapiens 
(Human), 261 aa (fragment). 


43..287 
16..261 


244/246(99%) 
244/246(99%) 


e-145 


CAD38961 


Hypothetical protein - Homo 
sapiens (Human), 434 aa 
(fragment). 


1..284 


158/285 (55%) 
218/285 (76%) 


3e-97 


Q96PD7 


Diacylglycerol 
acyltransferase 2 
(Hypothetical 43.8 kDa 
protein) - Homo sapiens 
(Human), 388 aa. 


1..284 
101..385 


158/285 (55%) 
218/285 (76%) 


3e-97 


Q9BYE5 


GS1999full protein - Homo 
sapiens (Human), 297 aa. 


1..284 
10..294 


158/285 (55%) 
218/285 (76%) 


3e-97 


Q9DCV3 


0610010B06Rik protein 
(Diacylglycerol 
acyltransferase 2) -Mus 
musculus (Mouse), 388 aa. 


1..284 
101..385 


159/285 (55%) 
217/285 (75%) 


8e-97 
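
The residue ranges in Table 21D indicate how much of the NOV21a protein each alignment covers. A minimal sketch of that calculation follows; the helper names `range_length` and `query_coverage` are assumptions introduced for illustration, and the query length of 287 residues is taken from Table 21A.

```python
def range_length(span: str) -> int:
    """Number of residues in an inclusive range written as 'first..last'."""
    first, last = (int(part) for part in span.split(".."))
    if last < first:
        raise ValueError(f"reversed range: {span}")
    return last - first + 1


def query_coverage(span: str, query_length: int) -> float:
    """Fraction of the query protein spanned by the aligned range."""
    return range_length(span) / query_length


NOV21A_LENGTH = 287  # residues, from Table 21A

for span in ("43..287", "1..284"):
    print(span, range_length(span), f"{query_coverage(span, NOV21A_LENGTH):.1%}")
```

The denominator of an identities fraction can be larger than the residue span computed this way (246 aligned positions against a span of 245 residues for the range 43..287), because gap positions in the alignment are counted as well.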



Example 22. 

The NOV22 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 22A. 



Table 22A. NOV22 Sequence Analysis 




SEQ ID NO: 111 | 1135 bp 


NOV22a, 
CG146513-01 
DNA Sequence 


ScAGTAAGAGATTATAGCAAAGC^TCTATAATCAACTC^^ 


GGCTTCTTGCCACAACAGAACAGCACCATAACCATGGCTTTCTTCTCCCGACTGAATCTCCAGGAGG 


GCCTCCAAACCTTCTTTGTTTTGCAATGGATCCCAGTCTATATATTTTTAGGAGCTATTCCCATTCT 
CCTTATACCCTACTTTCTGTTATTCAGTAAGTTCTGGCCCTTGGCTGTGCTCTCCTTAGCCTGGCTC 
ACCTATGATTGGAACACCCACAGTCAAGGTGGCAGGCGTTCAGCTTGGGTACGAAACTGGACCCTAT 
GGAAGTATTTCCGAAATTACTTCCCAGTACAGCTGGTGAAGACTCATGATCTTTCTCCCAAACACAA 
CTACATCATTGCCAATCACCCCCATGGCATTCTCTCTTTTGGTGTCTTCATCAACTTTGCCACTGAG 
GCCACTGGCATTGCTCGGATTTTCCCATCCATCACTCCCTTTGTAGGGACCTTAGAAAGGATATTTT 
GGATCCCAATTGTGCGAGAATATGTGATGTCAATGGGTGTGTGCCCTGTGAGTAGCTCAGCCTTGAA 
GTACTTGCTGACCCAGAAAGGCTCAGGCAATGCCGTGGTTATTGTGGTGGGTGGAGCTGCTGAAGCT 
CTCTTGTGCCGACCAGGAGCCTCCACTCTCTTCCTCAAGCAGCGTAAAGGTTTTGTGAAGATGGCAC 
TGCAAACAGGGGCATACCTTGTCCCTTCATATTCCTTTGGTGAGAACGAAGTTTTCAATCAGGAGAC 
CTTCCCTGAGGGCACGTGGTTAAGGTTGTTCCAAAAAACCTTCCAGGACACATTCAAAAAAATCCTG 
GGACTAAATTTCTGTACCTTCCATGGCCGGGGCTTCACTCGCGGATCCTGGGGCTTCCTGCCTTTCA 
ATCGGCCCATTACCACTGTTGGGGAACCCCTTCCAATTCCCAGGATTAAGAGGCCAAACCAGAAGAC 
AGTAGACAAGTATCACGCACTCTACATCAGTGCCC 

TATGGCCTCCCTGAGACCCAAGAGCTGACAATTACATAACAGGAGCCACATTCCCCATTGATC 




ORF Start: ATG at 101 | ORF Stop: TAA at 1109 






SEQ ID NO: 112 | 336 aa | MW at 38493.6kD 


NOV22a, 
CG146513-01 
Protein Sequence 


MAFFSRLNLQEGLQTFFVLQWIPVYIFLGAIPILLIPYFLLFSKFWPLAVLSLAWLTYDWNTHSQGG 
RRSAWVRNWTLWKYFRNYFPVQLVKTHDLSPKHNYIIANHPHGILSFGVFINFATEATGIARIFPSI 
TPFVGTLERIFWIPIVREYVMSMGVCPVSSSALKYLLTQKGSGNAVVIVVGGAAEALLCRPGASTLF 
LKQRKGFVKMALQTGAYLVPSYSFGENEVFNQETFPEGTWLRLFQKTFQDTFKKILGLNFCTFHGRG 
FTRGSWGFLPFNRPITTVGEPLPIPRIKRPNQKTVDKYHALYISALRKLFDQHKVEYGLPETQELTI 
T 
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Further analysis of the NOV22a protein yielded the following properties shown in 
Table 22B. 




Table 22B. Protein Sequence Properties NOV22a 


PSort analysis: 


0.6850 probability located in plasma membrane; 0.6400 probability located in 
endoplasmic reticulum (membrane); 0.3880 probability located in microbody 
(peroxisome); 0.3700 probability located in Golgi body 


SignalP analysis: 


Cleavage site between residues 65 and 66 



A search of the NOV22a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 22C. 



Table 22C. Geneseq Results for NOV22a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV22a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


AAM06866 


Human foetal protein, SEQ 
ID NO: 1074 -Homo 
sapiens, 225 aa. 
[WO200155339-A2, 
02-AUG-2001] 


1..216 
1..216 


211/216(97%) 
214/216 (98%) 


e-124 


ABB75677 


Breast protein-eukaryotic 
conserved gene 1 
(BSTP-ECG1) protein - 
Homo sapiens, 388 aa. 
[WO200208260-A2, 
31-JAN-2002] 


1..335 
56..387 


171/337 (50%) 
237/337 (69%) 


e-101 


AAB66170 


Protein of the invention #82 - 
Unidentified, 388 aa. 
[WO200078961-A1, 
28-DEC-2000] 


1..335 
56..387 


171/337 (50%) 
237/337 (69%) 


e-101 


AAU29191 


Human PRO polypeptide 
sequence #168 - Homo 
sapiens, 388 aa. 
[WO200168848-A2, 
20-SEP-2001] 


1..335 
56..387 


171/337 (50%) 
237/337 (69%) 


e-101 
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AAY99421 


Human PRO1433 (UNQ738) 
amino acid sequence SEQ ID 
NO:292 - Homo sapiens, 388 
aa. [WO200012708-A2, 
09-MAR-2000] 




1..335 
56..387 


171/337 (50%) 
237/337 (69%) 



e-101 



In a BLAST search of public sequence databases, the NOV22a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 22D. 



Table 22D. Public BLASTP Results for NOV22a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV22a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


Q9DCV3 


0610010B06Rik protein 
(Diacylglycerol 
acyltransferase 2) - Mus 
musculus (Mouse), 388 aa. 


1..335 
56..387 


172/337 (51%) 
238/337 (70%) 


e-101 


CAD38961 


Hypothetical protein - Homo 
sapiens (Human), 434 aa 
(fragment). 


1..335 
102..433 


171/337 (50%) 
237/337 (69%) 


e-100 


Q96PD7 


Diacylglycerol 
acyltransferase 2 
(Hypothetical 43.8 kDa 
protein) - Homo sapiens 
(Human), 388 aa. 


1..335 
56..387 


171/337 (50%) 
237/337 (69%) 


e-100 


Q8TAB1 


BA351K23.5 (Novel protein) 
- Homo sapiens (Human), 
296 aa (fragment). 


38..335 
1..295 


161/299 (53%) 
221/299 (73%) 


2e-98 


Q9BYE5 


GS1999full protein - Homo 
sapiens (Human), 297 aa. 


39..335 
2..296 


161/299 (53%) 
217/299 (71%) 


4e-96 



Example 23. 

The NOV23 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 23A. 



Table 23A. NOV23 Sequence Analysis 

SEQ ID NO: 113 | 1022 bp 
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NOV23a, 
CG146522-01 
DNA Sequence 


ACTGTTCTGAGATCTTTCXICTCCCTCAGGCTCC^ 


CTTCCAGAGTCTGATGCTTCTGCAGTGGCCTTTGAGCTACCTTGCCATCTTGTTCGTCTACCTGCTG 
TTTACATCCTTGTGGCCGCTACCAGTGCTTTACTTTGCCTGGTTGTTCCTGGACTGGAAGACCCCAG 
AGCGAGGTGGCAGGCGTTCGGCCTGGGTAAGGAACTGGTGTGTCTGGACCCACATCAGGGACTATTT 
CCCCATTATCCTGAAGACAAAGGACCTATCACCTGAGCACAACTACCTCATGGGGGTTCACCCCCAT 
GGCCTCCTGACCTTTGGCGCCTTCTGCAACTTCTGCACTGAGGCCACAGGCTTCTCGAAGACCTTCC 
CAGGCATCACTCCTCACTTGGCCACGCTGTCCTGGTTCTTCAAGATCCCCTTTGTTAGGGAGTACCT 
CATGGCCAAAGGTGTGTGCTCTGTGAGCCAGCCAGCCATCAACTATCTGCTGAGCCATGGCACTGGC 
AACCTCGTGGGCATTGTAGTGGGAGGTGTGGGTGAGGCCCTGCAAAGTGTGCCCAACACCACCACCC 
TCATCCTCCAGAAGCGCAAGGGGTTCGTGCGCACAGCCCTCCAGCATGGGGCTCATCTGGTCCCCAC 
CTTCACTTTTGGGGAAACTGAGGTGTATGATCAGGTGCTGTTCCATAAGGATAGCAGGATGTACAAG 
TTCCAGAGCTGCTTCCGCCGTATCTTTGGTTTCTACTGTTGTGTCTTCTATGGACAAAGCTTCTGTC 
AAGGCTCCACTGGGCTCCTGCCATACTCCAGGCCTATTGTCACTGTTGGGGAGCCTCTGCCACTGCC 
CCAAATTGAAAAGCCAAGCCAGGAGATGGTGGACAAATACCATGCACTTTATATGGATGCTCTGCAC 
AAACTGTTCGACCAGCATAAGACCCACTATGGCTGCTCAGAGACCCAAAAGCTGTTTTTCCTGTGAA 
TGAAGGTACTGCATGCC 




ORF Start: ATG at 42 | ORF Stop: TGA at 1002 






SEQ ID NO: 114 | 320 aa | MW at 36773.5kD 


NOV23a, 
CG146522-01 
Protein Sequence 


MAHSKQPSHFQSLMLLQWPLSYLAILFVYLLFTSLWPLPVLYFAWLFLDWKTPERGGRRSAWVRNWC 
VWTHIRDYFPIILKTKDLSPEHNYLMGVHPHGLLTFGAFCNFCTEATGFSKTFPGITPHLATLSWFF 
KIPFVREYLMAKGVCSVSQPAINYLLSHGTGNLVGIVVGGVGEALQSVPNTTTLILQKRKGFVRTAL 
QHGAHLVPTFTFGETEVYDQVLFHKDSRMYKFQSCFRRIFGFYCCVFYGQSFCQGSTGLLPYSRPIV 
TVGEPLPLPQIEKPSQEMVDKYHALYMDALHKLFDQHKTHYGCSETQKLFFL 



Further analysis of the NOV23a protein yielded the following properties shown in 
Table 23B. 
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Table 23B. Protein Sequence Properties NOV23a 


PSort analysis: 


0.7284 probability located in outside; 0.3880 probability located in microbody 
(peroxisome); 0.1000 probability located in endoplasmic reticulum 
(membrane); 0.1000 probability located in endoplasmic reticulum (lumen) 


SignalP analysis: 


Cleavage site between residues 43 and 44 



A search of the NOV23a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 23C. 



Table 23C. Geneseq Results for NOV23a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV23a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 




ABB75677 


Breast protein-eukaryotic 
conserved gene 1 
(BSTP-ECG1) protein - 
Homo sapiens, 388 aa. 
[WO200208260-A2, 
31-JAN-2002] 


4..317 
62..385 


165/324 (50%) 
224/324 (68%) 


1e-93 


AAB66170 


Protein of the invention #82 - 
Unidentified, 388 aa. 
[WO200078961-A1, 
28-DEC-2000] 


4..317 
62..385 


165/324 (50%) 
224/324 (68%) 


1e-93 


AAU29191 


Human PRO polypeptide 
sequence #168 - Homo 
sapiens, 388 aa. 
[WO200168848-A2, 
20-SEP-2001] 


4..317 
62..385 


165/324 (50%) 
224/324 (68%) 


1e-93 


AAY99421 


Human PRO1433 (UNQ738) 
amino acid sequence SEQ ID 
NO:292 - Homo sapiens, 388 
aa. [WO200012708-A2, 
09-MAR-2000] 


4..317 
62..385 


165/324 (50%) 
224/324 (68%) 


1e-93 


AAY94889 


Human protein clone 
HP02485 - Homo sapiens, 
334 aa. [WO200005367-A2, 
03-FEB-2000] 


11..319 
16..333 


144/318 (45%) 
200/318(62%) 


3e-74 


In a BLAST search of public sequence databases, the NOV23a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 23D. 


Table 23D. Public BLASTP Results for NOV23a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV23a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


Q8TAB1 


BA351K23.5 (Novel protein) 
- Homo sapiens (Human), 
296 aa (fragment). 


30..317 
3..293 


163/291 (56%) 
214/291 (73%) 


5e-96 


Q9DCV3 


0610010B06Rik protein 
(Diacylglycerol 
acyltransferase 2) - Mus 
musculus (Mouse), 388 aa. 


4..317 
62..385 


166/324 (51%) 
225/324(69%) 


2e-93 


CAD38961 


Hypothetical protein - Homo 
sapiens (Human), 434 aa 
(fragment). 


4..317 
108..431 


165/324(50%) 
224/324(68%) 


3e-93 


Q96PD7 


Diacylglycerol 
acyltransferase 2 
(Hypothetical 43.8 kDa 
protein) - Homo sapiens 
(Human), 388 aa. 


4..317 
62..385 


165/324 (50%) 
224/324 (68%) 


tm^i mJLh. imJ* .1^ < 


Q9BYE5 


GS1999full protein - Homo 
sapiens (Human), 297 aa. 


28..317 
1..294 


156/294 (53%) 
210/294 (71%) 


1e-89 



Example 24. 

The NOV24 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 24A. 



Table 24A. NOV24 Sequence Analysis 




SEQ ID NO: 115 | 1056 bp 




NOV24a, 
CG146531-01 
DNA Sequence 


CATTTTCCAAAGGTGTCACAGGAAGAGCATGGCAGAGCTGGGACTGGGAGCCAGGTCACCATGGCTT 


TATATTTTTAGGTTTGTTCGTCTACCTGCTGTTTACATCCTTGTGGCCGCTACCAGTGCTTTACTTT 
GCCTGGTTGTTCCTGGACTGGAAGACCCCAGAGCGAGGTGGCAGGCGTTCGGCCTGGGTAAGGAACT 
GGTGTGTCTGGACCCACATCAGGGACTATTTCCCCATTCAGATCCTGAAGACAAAGGACCTATCACC 
TGAGCACAACTACCTCATGGGGGTTCACCCCCATGGCCTCCTGACCTTTGGCGCCTTCTGCAACTTC 
TGCACTGAGGCCACAGGCTTCTCGAAGACCTTCCCAGGCATCACTCCTCACTTGGCCACGCTGTCCT 
GGTTCTTCAAGATCCCCTTTGTTAGGGAGTACCTCATGGCCAAAGGTGTGTGCTCTGTGAGCCAGCC 
AGCCATCAACTATCTGCTGAGCCATGGCACTGGCAACCTCGTGGGCATTGTAGTGGGAGGTGTGGGT 
GAGGCCCTGCAAAGTGTGCCCAACACCACCACCCTCATCCTCCAGAAGCGCAAGGGGTTCGTGCGCA 
CAGCCCTCCAGCATGGGGCTCATCTGGTCCCCACCTTCACTTTTGGGGAAACTGAGGTGTATGATCA 
GGTGCTGTTCCATAAGGATAGCAGGATGTACAAGTTCCAGAGCTGCTTCCGCCGTATCTTTGGTTTC 
TACTGTTGTGTCTTCTATGGACAAAGCTTCTGTCAAGGCTCCACTGGGCTCCTGCCATACTCCAGGC 
CTATTGTCACTGTTGGGGAGCCTCTGCCACTGCCCCAAATTGAAAAGCCAAGCCAGGAGATGGTGGA 
CAAATACCATGCACTTTATATGGATGCTCTGCACAAACTGTTCGACCAGCATAAGACCCACTATGGC 
TGCTCAGAGACCCAAAAGCTGTTTTTCCTGTGAATGAAGGTACTGCATGCC 




ORF Start: ATG at 61 | ORF Stop: TGA at 1036 





SEQ ID NO: 116 | 325 aa | MW at 37453.3kD 


NOV24a, 
CG146531-01 
Protein Sequence 


MAFFSRLNLQEGLQTFFVLQWIPVYIFLGLFVYLLFTSLWPLPVLYFAWLFLDWKTPERGGRRSAWV 
RNWCVWTHIRDYFPIQILKTKDLSPEHNYLMGVHPHGLLTFGAFCNFCTEATGFSKTFPGITPHLAT 
LSWFFKIPFVREYLMAKGVCSVSQPAINYLLSHGTGNLVGIVVGGVGEALQSVPNTTTLILQKRKGF 
VRTALQHGAHLVPTFTFGETEVYDQVLFHKDSRMYKFQSCFRRIFGFYCCVFYGQSFCQGSTGLLPY 
SRPIVTVGEPLPLPQIEKPSQEMVDKYHALYMDALHKLFDQHKTHYGCSETQKLFFL 



Further analysis of the NOV24a protein yielded the following properties shown in 
Table 24B. 



Table 24B. Protein Sequence Properties NOV24a 
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PSort analysis: 


0.8200 probability located in outside; 0.3880 probability located in microbody 
(peroxisome); 0.1000 probability located in endoplasmic reticulum 
(membrane); 0.1000 probability located in endoplasmic reticulum (lumen) 


SignalP analysis: 


Cleavage site between residues 47 and 48 



A search of the NOV24a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 24C. 



Table 24C. Geneseq Results for NOV24a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV24a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


ABB75677 


Breast protein-eukaryotic 
conserved gene 1 
(BSTP-ECG1) protein - 
Homo sapiens, 388 aa. 
[WO200208260-A2, 
31-JAN-2002] 


1..322 
56..385 


166/330 (50%) 
230/330 (69%) 


2e-96 


AAB66170 


Protein of the invention #82 - 
Unidentified, 388 aa. 
[WO200078961-A1, 
28-DEC-2000] 


1..322 
56..385 


166/330 (50%) 
230/330 (69%) 


2e-96 


AAU29191 


Human PRO polypeptide 
sequence #168 - Homo 
sapiens, 388 aa. 
[WO200168848-A2, 
20-SEP-2001] 


1..322 
56..385 


166/330 (50%) 
230/330 (69%) 


2e-96 


AAY99421 


Human PRO1433 (UNQ738) 
amino acid sequence SEQ ID 
NO:292 - Homo sapiens, 388 
aa. [WO200012708-A2, 
09-MAR-2000] 


1..322 
56..385 


166/330 (50%) 
230/330(69%) 


2e-96 


AAY94889 


Human protein clone 
HP02485 - Homo sapiens, 
334 aa. [WO200005367-A2, 
03-FEB-2000] 


13..324 
15..333 


147/321 (45%) 
200/321 (61%) 


2e-75 



In a BLAST search of public sequence databases, the NOV24a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 24D. 



Table 24D. Public BLASTP Results for NOV24a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV24a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


Q8TAB1 


BA351K23.5 (Novel protein) 
- Homo sapiens (Human), 
296 aa (fragment). 


34..322 
3..293 


163/291 (56%) 
215/291 (73%) 


1e-97 


CAD38961 


Hypothetical protein - Homo 
sapiens (Human), 434 aa 
(fragment). 


1..322 
102..431 


166/330 (50%) 
230/330 (69%) 


6e-96 


Q9DCV3 


0610010B06Rik protein 
(Diacylglycerol 
acyltransferase 2) - Mus 
musculus (Mouse), 388 aa. 


1..322 

56..385 


167/330 (50%) 


6e-96 


Q96PD7 


Diacylglycerol 
acyltransferase 2 
(Hypothetical 43.8 kDa 
protein) - Homo sapiens 
(Human), 388 aa. 


1..322 
56..385 


166/330 (50%) 
230/330 (69%) 


6e-96 


Q9BYE5 


GS1999full protein - Homo 
sapiens (Human), 297 aa. 


32..322 
1..294 


157/294 (53%) 
211/294 (71%) 


1e-91 



Example 25. 

The NOV25 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 25A. 



Table 25A. NOV25 Sequence Analysis 




SEQ ID NO: 117 | 951 bp 


NOV25a, 
CG147274-01 
DNA Sequence 


ATGGGGCTTCGGGCAGGCCCCATCCTGCTTCTGCTGCTGTGGCTGCTGCCAGGGGCCCATTGGGATG 
TGCTGCCTTCAGAATGCGGCCACTCCAAGGAGGCCGGGAGGATTGTGGGAGGCCAAGACACCCAGGA 
AGGACGCTGGCCGTGGCAGGTTGGCCTGTGGTTGACCTCAGTGGGGCATGTATGTGGGGGCTCCCTC 
ATCCACCCACGCTGGGTGCTCACAGCCGCCCACTGCTTCCTGAGGTCTGAGGATCCCGGGCTCTACC 
ATGTTAAAGTCGGAGGGCTGACACCCTCACTTTCAGAGCCCCACTCGGCCTTGGTGGCTGTGAGGAG 
GCTCCTGGTCCACTCCTCATACCATGGGACCACCACCAGCGGGGACATTGCCCTGATGGAGCTGGAC 
TCCCCCTTGCAGGCCTCCCAGTTCAGCCCCATCTGCCTCCCAGGACCCCAGACCCCCCTCGCCATTG 
GGACCGTGTGCTGGGTAAACGGGCTGGGGCCCACATCACATCCAGCCCTGGCGAGTGTCCTTCAGGA 
GGTGGCTGTGCCCCTCCTGGACTCGAACATGTGTGAGCTGATGTACCACCTAGGAGAGCCCAGCCTG 
GCTGGCCAGCGCCTCATCCAGGACGACATGCTCTGTGCTGGCTCTGTCCAGGGCAAGAAAGACTCCT 
GCCAGGGTGACTCCGGGGGGCCGCTGGTCTGCCCGATCAATGATACGTGGATCCAGGCCGGCATTG^ 




GACTGGATTCAGAGAACCCTGGCTGAATCTCACTCAGGCATGTCTGGGGCCCGCCCAGGTGCCCCAG 




TGGGTCCCTGTGA 




ORF Start: ATG at 1 | ORF Stop: TGA at 949 



SEQ ID NO: 118 | 316 aa | MW at 33574.2kD 


NOV25a, 
CG147274-01 
Protein Sequence 


MGLRAGPILLLLLWLLPGAHWDVLPSECGHSKEAGRIVGGQDTQEGRWPWQVGLWLTSVGHVCGGSL 
IHPRWVLTAAHCFLRSEDPGLYHVKVGGLTPSLSEPHSALVAVRRLLVHSSYHGTTTSGDIALMELD 
SPLQASQFSPICLPGPQTPLAIGTVCWVNGLGPTSHPALASVLQEVAVPLLDSNMCELMYHLGEPSL 
AGQRLIQDDMLCAGSVQGKKDSCQGDSGGPLVCPINDTWIQAGIVSWGPGCARPFRPGVYTQVLSYT 
DWIQRTLAESHSGMSGARPGAPGSHSGTSRSHPVLLLELLTVCLLGSL 



Further analysis of the NOV25a protein yielded the following properties shown in 
Table 25B. 



Table 25B. Protein Sequence Properties NOV25a 


PSort analysis: 


0.9190 probability located in plasma membrane; 0.3000 probability located in 
lysosome (membrane); 0.1000 probability located in endoplasmic reticulum 
(membrane); 0.1000 probability located in endoplasmic reticulum (lumen) 


SignalP analysis: 


Cleavage site between residues 22 and 23 
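
A predicted cleavage site between residues 22 and 23 divides the precursor into a 22-residue signal peptide and a mature chain that begins at residue 23. The sketch below illustrates the split; the function name `split_at_cleavage` is an assumption, and the input is only the first 30 residues of NOV25a as translated from the first 90 nucleotides of the DNA sequence in Table 25A.

```python
def split_at_cleavage(protein: str, last_signal_residue: int) -> tuple[str, str]:
    """Split a precursor at a cleavage site given as the 1-based position
    of the last residue of the signal peptide."""
    if not 0 < last_signal_residue < len(protein):
        raise ValueError("cleavage site lies outside the sequence")
    return protein[:last_signal_residue], protein[last_signal_residue:]


# First 30 residues of NOV25a (N-terminal fragment only, for illustration).
NOV25A_N_TERMINUS = "MGLRAGPILLLLLWLLPGAHWDVLPSECGH"

signal, mature_start = split_at_cleavage(NOV25A_N_TERMINUS, 22)
print(signal)        # MGLRAGPILLLLLWLLPGAHWD
print(mature_start)  # VLPSECGH
```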



A search of the NOV25a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 25C. 
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Table 25C. Geneseq Results for NOV25a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV25a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


AAU98887 


Human protease PRTS5 - 
Homo sapiens, 304 aa. 
[WO200238744-A2, 
16-MAY-2002] 


1..316 
1..304 


304/316(96%) 
304/316(96%) 


0.0 


AAW77W* 


Amino acid sequence of 

SP002LA, a homologue of 
HELA2 - Homo sapiens, 289 
aa. [WO9836054-A1, 
20-AUG-1998] 


28..316 
1..289 


285/289 (98%) 
285/289 (98%) 


e-171 


ABG64545 


Human albumin fusion 
protein #1220 - Homo 
sapiens, 290 aa. 
[WO200177137-A1, 
18-OCT-2001] 


5..275 
6..276 


121/275 (44%) 
168/275 (61%) 


1e-63 


AAB73945 


Human protease T - Homo 
sapiens, 290 aa. 
[WO200116293-A2, 
08-MAR-2001] 


5..275 
6..276 


121/275 (44%) 
168/275 (61%) 


1e-63 


AAE03821 


Human gene 4 encoded 
secreted protein HWHIH10, 
SEQ ID NO: 67 -Homo 
sapiens, 290 aa. 
[WO200136440-A1, 
25-MAY-2001] 


5..275 
6..276 


121/275 (44%) 
168/275 (61%) 


1e-63 



In a BLAST search of public sequence databases, the NOV25a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 25D. 



Table 25D. Public BLASTP Results for NOV25a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV25a 

Residues/ 

Match 

Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


Q91XC4 


Similar to distal intestinal 
serine protease - Mus 
musculus (Mouse), 310 aa. 


1..316 
1..310 


202/317 (63%) 
235/317 (73%) 


e-114 
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Q9QYZ9 


Distal intestinal serine 
protease - Mus musculus 
(Mouse), 310 aa. 



1..316 
1..310 


201/317 (63%) 
233/317 (73%) 


e-113 


Q9BQR3 


Marapsin precursor (EC 
3.4.21.-) - Homo sapiens 
(Human), 290 aa. 


5..275 
6..276 


121/275 (44%) 
168/275 (61%) 


3e-63 


Q8R1A6 


RIKEN cDNA 2010001P08 
gene - Mus musculus 
(Mouse), 331 aa. 


24..305 
41..329 


142/293 (48%) 
174/293 (58%) 


5e-62 


Q9DGR3 


Embryonic serine protease- 1 - 
Xenopus laevis (African 
clawed frog), 317 aa. 


25..304 
29..308 


123/288 (42%) 
165/288 (56%) 


1e-59 



PFam analysis predicts that the NOV25a protein contains the domains shown in the 
Table 25E. 
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Table 25E. Domain Analysis of NOV25a 


Pfam Domain 


NOV25a Match Region 


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


trypsin 


37..271 


109/266 (41%) 
176/266 (66%) 


1.7e-79 



Example 26. 

The NOV26 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 26A. 



Table 26A. NOV26 Sequence Analysis 




SEQ ID NO: 119 | 970 bp 


NOV26a, 
CG147351-01 
DNA Sequence 


CAC^GAACAATATGCAGCTGAGATOAGTAAAGCTATTGCTTTTGAGAT<^TTCAGAAATACGAGCC 
TATCGAAGAAGTTAGGAAAGCACACC^VAATGTCATTAGAAGGTTTTAC^GATACATGGATTCACGT 
GAATGTCTACTGTTTAAAAATGAATGTAGAAAAGTTTATCAAGATATGACTCATCCATTAAATGATT 
ATTTTATTTCATCTTCACATAACACATATTTGGTATCTGATCAATTATTGGGACCAAGTGACCTTTG 
GGGATATGTAAGTGCCCTTGTGAAAGGATGCCGTTGTTTGGAGATTGACTGCTGGGATGGAGCACAA 
AATGAACCTGTTGTATATCATGGCTACACACTCACAAGCAAACTTCTGTTTAAAACTGTTATCCAAG 
CTATACACAAGTATGCATTCATGGTGGCTTTAAATTTCCAGACCCCTGGTCTGCCCATGGATCTGCA 
AAATGGGAAATTTTTGGATAATGGTGGTTCTGGATATATTTTGAAACCACATTTCTTA^ 
AAATCATACTTTAACCCAAGTAACATAAAAGAGGGTATGCCAATTACACTTACAATAAGGCTCATCA 
GTGGTATCCAGTTGCCTCTTACTCATTCATCATCTAACAAAGGTGATTCATTAGTAATTATAGAAGT 
TTTTGGTGTTCCAAATGATCAAATGAAGCAGCAGACTCGTGTAATTAAAAAAAATGCTTTTAGTCCA 
AGATGGAATGAAACATTGACATTTATTATTCATGTCCCAGAA 

AAGGTCAAGGTTTAATAGCAGGAAATGAATTTCTTGGGCAATATACTTTGCCACTTCTATGCATGAA 
CAAAGGTTATCGTCGTATTCCTCTGTTTTCCAGAATGGGTGAGAGCCTTGAGCCTGCTTCACTGTTT 
GTTTATGTTTGGTACGTCAGATAACAGCTAAG 



SEQ ID NO: 120 | 312 aa | MW at 35720.0kD 


NOV26a, 
CG147351-01 
Protein Sequence 


MSKAIAFEIIQKYEPIEEVRKAHQMSLEGFTRYMDSRECLLFKNECRKVYQDMTHPLNDYFISSSHN 
TYLVSDQLLGPSDLWGYVSALVKGCRCLEIDCWDGAQNEPVVYHGYTLTSKLLFKTVIQAIHKYAFM 
VALNFQTPGLPMDLQNGKFLDNGGSGYILKPHFLRESKSYFNPSNIKEGMPITLTIRLISGIQLPLT 
HSSSNKGDSLVIIEVFGVPNDQMKQQTRVIKKNAFSPRWNETFTFIIHVPELALIRFVVEGQGLIAG 
NEFLGQYTLPLLCMNKGYRRIPLFSRMGESLEPASLFVYVWYVR 



Further analysis of the NOV26a protein yielded the following properties shown in 
Table 26B. 
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Table 26B. Protein Sequence Properties NOV26a 


PSort analysis: 


0.5844 probability located in microbody (peroxisome); 0.1814 probability 
located in lysosome (lumen); 0.1000 probability located in mitochondrial 
matrix space; 0.0000 probability located in endoplasmic reticulum 
(membrane) 


SignalP analysis: 


No Known Signal Sequence Predicted 



A search of the NOV26a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 26C. 



Table 26C. Geneseq Results for NOV26a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV26a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Region 


Expect 
Value 


AAU76817 


Human phospholipase C 
16839 polypeptide - Homo 
sapiens, 608 aa. 
[WO200206302-A2, 
24-JAN-2002] 


134..312 
430..608 


179/179 (100%) 
179/179 (100%) 


e-101 


ABB90425 


Human polypeptide SEQ ID 
NO 2801 - Homo sapiens, 
179 aa. [WO200190304-A2, 
29-NOV-2001] 


134..312 
1..179 


179/179 (100%) 
179/179 (100%) 


e-101 


AAU87271 


Novel central nervous 
system protein #181 - Homo 
sapiens, 254 aa. 
[WO200155318-A2, 
02-AUG-2001] 


134..312 
76..254 


179/179 (100%) 
179/179 (100%) 


e-101 


AAM95867 


Human reproductive system 
related antigen SEQ ID NO: 
4525 - Homo sapiens, 254 
aa. [WO200155320-A2, 
02-AUG-2001] 


134..312 
76..254 


178/179(99%) 
178/179 (99%) 


e-100 


AAU22938 


Novel human enzyme 
polypeptide #24 - Homo 
sapiens, 254 aa. 
[WO200155301-A2, 
02-AUG-2001] 


134..312 
76..254 


178/179 (99%) 
178/179 (99%) 


e-100 
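
As an illustration of how the Identities column in tables such as Table 26C can be read, the following Python sketch recomputes the percentage from the matched and aligned residue counts. The function name identity_percent is hypothetical, and the use of integer truncation is an assumption inferred from the tabulated identity values (for example, 178/179 is listed as 99%); it is not stated in this document.

```python
import re


def identity_percent(cell: str) -> int:
    """Recompute the identity percentage from a cell such as '178/179 (99%)'.

    Assumption: the percentage is truncated to an integer, which agrees with
    the identity values tabulated here but is not stated in the text.
    """
    match = re.fullmatch(r"\s*(\d+)\s*/\s*(\d+)\s*(?:\(\s*\d+\s*%\s*\))?\s*", cell)
    if match is None:
        raise ValueError(f"unrecognised identities cell: {cell!r}")
    matched, aligned = int(match.group(1)), int(match.group(2))
    if aligned == 0:
        raise ValueError("aligned length must be positive")
    # Integer division truncates, e.g. 178/179 = 99.44... becomes 99.
    return (100 * matched) // aligned


# Identity cells copied from Table 26C.
for cell in ("179/179 (100%)", "178/179 (99%)"):
    print(cell, "->", identity_percent(cell))
```

A check of this kind may help flag cells where text extraction has damaged the digits, since a recomputed value that disagrees with the printed percentage suggests a misread count.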



In a BLAST search of public sequence databases, the NOV26a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 26D. 



Table 26D. Public BLASTP Results for NOV26a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV26a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


BAC05152 


CDNA FLJ40406 fis, clone 
TESTI2037534, weakly similar to 
1-PHOSPHATIDYLINOSITOL-4,5-BISPHOSPHATE 
PHOSPHODIESTERASE DELTA 1 
(EC 3.1.4.11) - Homo sapiens 
(Human), 390 aa. 


134..312 
212..390 


179/179 (100%) 
179/179 (100%) 


e-101 


Q96J70 


Testis-development related NYD-SP27 
- Homo sapiens (Human), 504 aa. 


134..312 
326..504 


178/179 (99%) 
178/179 (99%) 


e-100 


Q95JS0 


Hypothetical 74.4 kDa protein - 
Macaca fascicularis (Crab eating 
macaque) (Cynomolgus monkey), 640 
aa. 


134..312 
462..640 


172/179 (96%) 
177/179 (98%) 


2e-97 


Q95JS1 


Hypothetical 74.6 kDa protein - 
Macaca fascicularis (Crab eating 
macaque) (Cynomolgus monkey), 641 
aa. 


134..312 
463..641 


172/179 (96%) 
177/179 (98%) 


2e-97 


AAM95914 


PLC-zeta - Mus musculus (Mouse), 
647 aa. 


134..312 
467..646 


135/181 (74%) 
158/181 (86%) 


7e-73 
PFam analysis predicts that the NOV26a protein contains the domains shown in 
Table 26E. 



Table 26E. Domain Analysis of NOV26a 


Pfam Domain 


NOV26a Match Region 


Identities/ 
Similarities 
for the Matched 
Region 


Expect Value 


PI-PLC-X 


52..133 


45/83 (54%) 
66/83 (80%) 


4.3e-36 


PI-PLC-Y 


134..169 


25/43 (58%) 
33/43 (77%) 


2.9e-17 


C2 


188..276 


33/97 (34%) 
73/97 (75%) 


4.9e-20 



Example 27. 

The NOV27 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 27A. 



Table 27A. NOV27 Sequence Analysis 




SEQ ID NO: 121 


3136 bp 




NOV27a, 
CG147419-01 
DNA Sequence 


AC^GAGTCGTGTCGGCGCCACCCCGGCCCCCGAGCCCGCAGATTGCCCACCGAAGCTCGTGTGTGCA 


CCCCCGATCCCGCCAGCCACTCGCCCCTGGCCTCGCGGGCCGTGTCTCCGGCATCATGTGTGGTATA 


TTTGCTTACTTAAACTACCATGTTCCTCGAACGAGACGAGAAATCCTGGAGACCCTAATCAAAGGCC 
TTCAGAGACTGGAGTACAGAGGATATGATTCTGCTGGTGTGGGATTTGATGGAGGCAATGATAAAGA 
TTGGG AAGC C AATGC CTGC AAAACC C AGCTTATTAAGAAG AAAGGAAAAGTTAAGG C AC TGGATGAA 
GAAGTTCACAAGCAACAAGATATGGATTTGGATATAGAATTTGATGTACACCTTGGAATAGCTCATA 
CCCGTTGGGCAACACATGGAGAACCCAGTCCTGTCAATAGCCACCCCCAGCGCTCTGATAAAAATAA 
TGAATTTATCGTTATTCACAATGGCATCATCACC^CTACAAAGACTTGAAAAAGTTTTTGGAAAGC 
AAAGGC TATGAC T TCGAATC TGAAAC AGAC ACAG AGAC AATTGCCAAGC T CGTT AAGTAT ATGT ATG 
ACAATCGGGAAAGTCAAGATACCAGCTTTACTACCTTGGTGGAGAGAGTTATCCAACAATTGGAAGG 




CCTCTGTTGATTGGTGTACGGAGTGAACATAAACTTTCTACTGATCACATTCCTATACTCTACAGAA 
CAGCTAGGACTCAGATTGGATCAAAATTCACACGGTGGGGATCACAGGGAGAAAGAGGCAAAGACAA 
GAAAGGAAGCTGCAATCTCTCTCGTGTGGACAGCACAACCTGCCTTTTCCCGGTGGAAGAAAAAGCA 
GTGGAGTATTACTTTGCTTCTGATGCAAGTGCTGTCATAGAACACACCAATCGCGTCATCTTTCTGG 
AAGATGATGATGTTGCAGCAGTAGTGGATGGACGTCTTTCTATCCATCGAATTAAACGAACTGCAGG 
AGATCACCCCGGACGAGCTGTGCAAACACTCCAGATGGAACTCCAGCAGATCATGAAGGKSCAACTTC 
AGTTCATTTATGCAGAAGGAAATATTTGAGCAGCCAGAGTCTGTCGTGAACACAATGAGAGGAAGAG 
TCAACTTTGATGACTATACTGTGAATTTGGGTGGTTTGAAGGATCACATAAAGGAGATCCAGAGATG 
CCGGCGTTTGATTCTTATTGCTTGTGGAACAAGTTACCATGCTGGTGTAGCAACACGTCAAGTTCTT 
GAGGAGCTGACTGAGTTGCCTGTGATGGTGGAACTAGCAAGTGACTTCCTGGACAGAAACACACCAG 
TCTTTCGAGATGATGTTTGCTTTTTCCTTAGTCAATCAGGTGAGACAGCAGATACTTTGATGGGTCT 
TCGTTACTGTAAGGAGAGAGGAGC TTTAACTGTGGGGATC AC AAAC ACAGTTGGC AGTTC CATATCA 
CGGGAGACAGATTGTGGAGTTCATATTAATGCTGGTCCTGAGATTGGTGTGGCCAGTACAAAGGCTT 
ATACCAGCC AGTTTG T ATC C CTTGTG ATGTT TG C C C TTATG ATGTG TGATGATC GGATC TC CATGC A 
AGAAAG ACGC AAAGAG ATCATGCTTGGAT TG AAACGGCTGC CTGAT TTGATTAAGGAAGTAC TGAGC 
ATGGATGAC G AAATT C AG AAACTAGC AAC AGAAC TTT ATC ATCAG AAGTC AGTTC TGATAATGGGAC 
GAGGC T ATC ATTATGC T AC TTGTCTTGAAGGGG C AC TG AAAATCAAAG AAATT AC TT ATATGCAC TC 
TGAAGGCATCCTTGCTGGTGAATTGAAACATGGCCCTC 
ATCATGATCATC ATGAGAGATC ACACTTATGCCAAGTGTCAGAATGC i C rTCAGCAAGTGGTTGCTC 
G^GAGGGGCGGCCTGTGGTAATTTGTGATAAGGAGGATACTGAGACC A i PAAGAACAC AAAAAGAAC 
GATCAAGGTGCCCCACTCAGTGGACTGCTTGCAGGGCAi. rciL.AtjL.vjr r\3»A±T_L.L.i l ial,Avj1 lL?t- lo 
GCTTTCCACCTTGCTGTGCTGAGAGGCTATGATGTTGA1 A ICCL.Av-vjL7/u\l\. l 1Lt1\sA 
CTGTAGAGTGAGGAATATCTATACAAAATGTACGAAACTLiI A1tj»A1 1 ftAvi^ AAt A^ftAbALALL i, X x 


TGTATTTAAAACCTTGATTTAAAATATCACCCCTTGAAGCgTTTTTT 1 AGTAAATCCTTATTTATAT 


ATC AGTTAT AATTATTCC ACTC AATATGTGATTTTTGTG AAGTTACC TC TT AC ATTTTCC C AGTAAT 


T 1 Tr2TftfiAf^^PTTTnaATaATRRAATrTATATTRRAATr ,, I , GTATCAGAAAGATTCTAGCTATTATTT 


TCTTTAAAGAATGCTGGGTGTTGCATTTCTGGACCCTCCACTTCAATCTGAGAAGACAATATGTTTC 


TAAAAATTGGTACTTGTTTCACCATACTTCATTCAGACCAGTGAAAGAGTAGTGCATTTAATTGGAG 


TATCTAAAGCCAGTGGCAGTGTATGCTCATACTTGGACAGTTAGGGAAGGGTTTGCCAAGTTTTAAG 


AGAAGATGTGATTTATTTTGAAATTTGTTTCTGTTTTGTTTTTAAATCAAACTGTAAAACTTAAAAC 


TGAAAAATTTTATTGGTAGGATTTATATCTAAGTTTGGTTAGCCTTAGTTTCTCAGACTTGTTGTCT 


ATTATCTGTAGGTGGAAGAAATTTAGGAAGCGAAATATTACAGTAGTGCATTGGTGGGTCTCAATCC 


TTAACATATTTGCACAATTTTATAGCACAAACTTTAAATTCAAGCTGCTTTGGACAACTGACAATAT 


GATTTTAAATTTGAAGATGGGATGTGTACATGTTGGGTATCCTACTACTTTGTGTTTTCATCTCCTA 


AAAGTGTTTTTTATTTCCTTGTATCTGTAGTCTTTTATTTTTTAAATGACTGCTGAATGACATATTT 


T ATC TTGTTC TTT AAAATC AC AAC ACAG AGCTGC T ATTAAATTAAT ATTG ATAT 




ORF Start: ATG at 123 | ORF Stop: TGA at 2220 





SEQ ID NO: 122 | 699 aa | MW at 78793.6kD 


NOV27a, 
CG147419-01 
Protein Sequence 


MCG I FAYLNYHVPRTRRE I LETL IKGLQRLB YRG YDS AGVGFDGGNDKDWEANACKTQL IKKKGKVK 
ALDEEVHKQ QDMDLDI EFDVHLG IAHTRWATHGEPS PVNSH PQRSDKNNEF I VIHNG 1 1 TNYKDLKK 
FliESKGYBFESETDTETIAKLVKYMYDNRESQDTSFTTLVERVIQQLEGAFALVFKSVHFPGQAVGT 
RRGSPLLIGVRSEHKLSTDHIPILYRTARTQIGSKFTRWGSQGERGKDKKGSCNLSRVDSTTCLFPV 
EEKAVE YYF ASDAS AVI EHTNRVI FLEDDDVAAWDGRLS IHR IKRTAGDHPGRAVQTLQMELQQIM 
KGNFS S FMQKEIFEQPESVVNTMRGRVNFDDYTVNLGGLKDHIKEI QRCRRL I LI ACGTS YHAG VAT 
RQVLEELTELPVMVELASDFLDRNTPVFRDDVCFFLSQ 

SSI SRETDCGVHINAGPEIGVASTKAYTSQFVSLVMFALMMCDDRI SMQERRKE IMLGLKRLPDL IK 
EVL SMDDE IQKLATELYHQKSVL IMGRGYHYATCLEGALKIKE I T YMHSEGI LAGELKHGPLALVDK 
LMPVIMIIMRDHTYAKCQNALQQWARQGRPWICDKEOTETIKNTKRTIKVPHSVDCLQGILSVIP 
LQLLAFHLAVLRGYDVDFPRNIiAKSVTVE 
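
The ORF coordinates and the protein length reported in Table 27A can be cross-checked with simple arithmetic, sketched below in Python. The function name orf_codon_count is hypothetical. The sketch assumes that the positions are 1-based and that the reported stop position is the first base of the stop codon; this assumption is inferred from the tabulated values and is not stated in this document.

```python
def orf_codon_count(orf_start: int, orf_stop: int) -> int:
    """Return the number of coding codons implied by ORF start/stop positions.

    Assumption: positions are 1-based and orf_stop is the first base of the
    stop codon, so the coding region spans orf_stop - orf_start bases.
    """
    span = orf_stop - orf_start
    if span <= 0 or span % 3 != 0:
        raise ValueError("ORF coordinates are not in frame")
    return span // 3


# Values copied from Table 27A (NOV27a): ATG at 123, TGA at 2220, 699 aa.
print(orf_codon_count(123, 2220))
```

Under the stated assumption the result equals the 699 residues listed for SEQ ID NO: 122; the same check can be applied to the other sequence analysis tables.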



Further analysis of the NOV27a protein yielded the following properties shown in 
Table 27B. 
Table 27B. Protein Sequence Properties NOV27a 


PSort analysis: 


0.4902 probability located in mitochondrial inner membrane; 0.4400 
probability located in plasma membrane; 0.3000 probability located in 
microbody (peroxisome); 0.2000 probability located in endoplasmic reticulum 
(membrane) 


SignalP analysis: 


No Known Signal Sequence Predicted 



A search of the NOV27a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 27C. 



Table 27C. Geneseq Results for NOV27a 








Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV27a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


ABB05747 


Human GFAT1L protein 
SEQ ID NO: 1 - Homo 
sapiens, 699 aa. 
[WO200196574-A1, 
20-DEC-2001] 


1..699 
1..699 


698/699 (99%) 
698/699 (99%) 


0.0 


AAY90260 


Human GFAT protein 
sequence - Homo sapiens, 
681 aa. [WO200037617-A1, 
29-JUN-2000] 


1..699 
1..681 


681/699 (97%) 
681/699 (97%) 


0.0 


AAR43348 


Human GFAT - Homo 
sapiens, 681 aa. 
[WO9321330-A, 
28-OCT-1993] 


1..699 
1..681 


680/699 (97%) 
680/699 (97%) 


0.0 


AAY90261 


Human GFAT II protein 
sequence - Homo sapiens, 
682 aa. [WO200037617-A1, 
29-JUN-2000] 


1..699 
1..682 


541/701 (77%) 
618/701 (87%) 


0.0 


AAW37772 


Human glutamine:fructose-6-phosphate 
amidotransferase 
TGC028-4 - Homo sapiens, 
682 aa. [EP824149-A2, 
18-FEB-1998] 


1..699 
1..682 


541/701 (77%) 
618/701 (87%) 


0.0 



In a BLAST search of public sequence databases, the NOV27a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 27D. 



Table 27D. Public BLASTP Results for NOV27a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV27a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


Q99MJ4 


Glutamine:fructose-6-phosphate 
amidotransferase 1 muscle 
isoform GFAT1M - Mus 
musculus (Mouse), 697 aa. 


1..699 
1..697 


688/699 (98%) 
690/699 (98%) 


0.0 
A45055 


glutamine--fructose-6-phosphate 
transaminase (isomerizing) (EC 
2.6.1.16) - human, 681 aa. 


1..699 
1..681 


681/699 (97%) 
681/699 (97%) 


0.0 


Q06210 


Glucosamine--fructose-6-phosphate 
aminotransferase [isomerizing] 
1 (EC 2.6.1.16) (Hexosephosphate 
aminotransferase 1) 
(D-fructose-6-phosphate 
amidotransferase 1) (GFAT 1) 
(GFAT1) - Homo sapiens 
(Human), 680 aa. 


2..699 
1..680 


680/698 (97%) 
680/698 (97%) 


0.0 


BAB31882 


Gfpt1 protein - Mus musculus 
(Mouse), 681 aa. 


1..699 
1..681 


674/699 (96%) 
676/699 (96%) 


0.0 


P47856 


Glucosamine--fructose-6-phosphate 
aminotransferase [isomerizing] 
1 (EC 2.6.1.16) (Hexosephosphate 
aminotransferase 1) 
(D-fructose-6-phosphate 
amidotransferase 1) (GFAT 1) 
(GFAT1) - Mus musculus 
(Mouse), 680 aa. 


2..699 
1..680 


673/698 (96%) 
675/698 (96%) 


0.0 



PFam analysis predicts that the NOV27a protein contains the domains shown in 
Table 27E. 



Table 27E. Domain Analysis of NOV27a 


Pfam Domain 


NOV27a Match Region 


Identities/ 
Similarities 
for the Matched 
Region 


Expect Value 


GATase_2 


2..210 


91/219 (42%) 
202/219 (92%) 


4.6e-127 


SIS 


378..512 


52/156 (33%) 
118/156 (76%) 


2.2e-48 


SIS 


549..685 


52/156 (33%) 
124/156 (79%) 


3.3e-46 



Example 28. 

The NOV28 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 28A. 
Table 28A. NOV28 Sequence Analysis 



SEQ ID NO: 123 | 2521 bp 



NOV28a, 
CG148102-01 
DNA Sequence 



ACTCTGCCCGACTCAGGGCTCCAGCGTGACATGGCTGAAGCGCACCAGGCCGTGGGCTTCCGACCCT 



CGCTGACCTCGGACGGGGCTGAAGTGGAACTCAGTGCCCCTGTGCTGCAGGAGATCTACCTCTCTGG 
CCTGCGCTCCTGGAAAAGGCATCTCTCACGTTTCTGGGTGCAGAATGACTTTCTCACCGGTGTGTTT 
CCTGCCAGCCCCCTCAGTTGGCTTTTCCTCTTCAGTGCCATCCAGCTTGCCTGGTTCCTCCAGCTGG 
ATCCTTCCTTAGGACTGATGGAGAAGATCAAAGAGTTGCTGCGGGGGGTCCTGGCAGCCGCGCTGTT 
TGCCTCGTGTTTGTGGGGAGCCCTGATCTTCACACTGCACGTGGCCCTGAGGCTGCTTCTGTCCTAC 
CACGGCTGGCTTCTTGAGCCCCACGGAGCCATGTCCTCCCCCACCAAGACCTGGCTGGCCCTGGTCC 
GCATCTTCTCTGGCCGCCACCCGATGCTGTTCAGTTACCAGCGCTCCCTGCCACGCCAGCCCGTGCC 
CTCTGTGCAGGACACCGTGCGCAAGTACCTGGAGTCGGTCCGGCCCATCCTCTCCGACGAGGACTTC 
GACTGGACCGCGGTCCTGGCGCAGGAATTCCTGAGGCTGCAGGCGTCGCTGCTGCAGTGGTACCTGC 
GGCTCAAGTCCTGGTGGGCGTCCAATTATGTGAGTGACTGGTGGGAGGAATTTGTGTACCTGCGCTC 
CCGAAATCCGCTGATGGTGAACAGCAACTATTACATGATGGACTTCCTGTATGTCACACCCACGCCT 
CTGCAGGCAGCTCGCGCTGGGAATGCCGTCCATGCCCTCCTCCTGTACCGCCACCGCCTGAACCGCC 
AGGAGATACCCCCGGTGAGAC TGATGGGAATGCGCCCCTTATGCTC TGCCCAGTACGAGAAGATCTT 
C AACACCACGCGGATTCC AGGGGTC CAAAAAGGTGAGACC ATCCGCC ACC TCCATG AC AGCCAAC AC 
GTGGCTGTCTTCCACCGGGGCCGATTCTTCCGCATGGGGACCCACTCCCGAAACAGCCTGCTTTCCC 
CGAG AGCCC TGGAGC AGCAGTTTC AGAG AATCC TGGATG ATCCCTC ACCGGCCTGC CC CC ACGAGGA 
ACATCTGGCAGCTCTGACAGCTGCTCCCAGGGGCACGTGGGCCCAGGTGCGGACATCCCTGAAGACC 
CAGGCAGCGGAGGCCCTGGAGGCGGTGGAAGGGGCCGCTTTCTTTGTGTCACTGGATGCTGAGCCCG 
CGGGGCTCACCAGGGAGGACCCGGCAGCGTCGTTGGATGCCTACGCCCATGCTCTGCTGGCCGGCCG 
GGGCCATGATCGGTGGTTTGACAAATCCTTCACCCTAATCGTCTTCTCTAACGGGAAGCTGGGCCTC 
AGCGTGGAGCACTCCTGGGCCGACTGCCCCATCTCAGGACACATGTGGGAGTTCACTCTGGCTACAG 
AATGCTTTCAGCTGGGCTACTCAACAGACGGCCACTGCAAGGGGCACCCGGACCCCACACTACCCCA 
GCCCCAGCGGCTGCAATGGGACCTTCCAGACCAGGTGAGGCTGGGTATCTCTCTAGCCCTGAGGGGA 
GCCAAGATCTTGTCTGAAAATGTCGACTGCCATGTCGTTCCATTCTCCCTATTTGGCAAGAGCTTCA 
TCCGACGCTGCCACCTCTCTTCAGACAGCTTCATCCAGATCGCCTTGCAACTGGCCCACTTCCGGGA 
CCCACAGTGCCTCGCCCTGTTCCGCGTGGCAGTGGACAAGCACCAGGCTCTGCTGAAGGCAGCCATG 
AGCGGGCZAGGGAGTTGACCGCCACCTGTTTGCGCTGTACATCGTGTCCCGATTCCTCCACCTGCAGT 
CGCCCTTCC TGACCC AGGTCC ATTCGGAGCAGTGGCAGCTGTCC ACC AGCCAGATCCC TGTTCAGC A 
AATGCATCTGTTTGACGTCCACAATTACCCGGACTATGTTTCCTCAGGCGGTGGATTCGGGCCTGCT 
GATGACCATGGTTATGGTGTTTCTTATATCTTCATGGGGGATGGCATGATCACCTTCCACATCTCCA 
GCAAAAAATCAAGCACAAAAACGGATTCCCACAGGCTGGGGCAGCACATTGAGGACGCACTGCTGGA 
TGTGGCCTCCCTGTTCCAGGCGGGACAGCATTTTAAGCGCCGGTTCAGAGGGTCAGGGAAGGAGAAC 
TCCAGGCACAGGTGTGGATTTCTCTCCCGCCAGACTGGGGCCTCCAAGGCCTCAATGACATCCACCG 
ACTTCTG ACTCCTTCCAGCAGGCAGCTGGCCTCTCCAAGGAATAAGGGTGAAATTGCCACAGCTGGC 



TGACACAGGACAGGGGCAACTGGTTTGGCAACCCCACATCCAGGCCAATAAAGATGTGTGAGCTGGG 



TGTGTGGTGTCTGCTATGCTCTTGGGCAGGGCAGGGGTAGAAGAGGTAAGGACCAGGGTGGAGGAGG 



ACAGAj^GCTCCCATCCATTCCCAGGCCCAGCCAGGGATTCCC 



ORF Start: ATG at 31 



ORF Stop: TGA at 2284 





SEQ ID NO: 124 | 751 aa | MW at 84918.2kD 


NOV28a, 
CG148102-01 
Protein Sequence 


MAEAHQ AVGFR P S LT S DG AEVEL S A P VLQ E I YL SGLRSWKRHL S RFWVQNDFLTG VF PAS PLSWL FL 
FSAIQLAWFLQLDPSLGLMEK IKELLRG VLAAALFASCLWGAL I FTLHVALRLLLS YHGWLLEPHGA 
MS S PTKTWLALVR I F SGRHPMLF SY QRSL PRQPVPSVQDTVRKYLES VRPI LSDEDFDWTAVLAQEF 
IJU,QASLLQWYLRLKSVWASNYVSI>WWEEF 

HALLLYRHRI^QEIPPVRmGMRPLCSAQYEKIFimillPGVQKGETIRHLHDSQHVAVFHRGRFF 
RMGTHSRNSLLS PRALEQQFQRILDDPS PAC PHEEHLAALTAAPRGTWAQVRTSLKTQAAEALEAVE 
GAAFFVSLDAEPAGLTREDPAASLDAYAHALLAGRGHDRWFDKSFTLIVFSNGKLGLSVEHSWADCP 
I SGHMWEFTIiATECFQLGYSTDGHCKGH PDPTL PQPQRLQWDL PDQVRLGI SLALRGAKIL SENVDC 
HWPFSLFGKSFIRRCHLSSDSFIQIALQLAHFRDPQ^LALF^ 

ALYI VSRFLHLQS PFLTQVHSEQWQLSTSQI PVQQMHLFDVHNYPDYVSSGGGFGPADDHGYGVSYI 
FMGDGMI TFHI S SKKS STKTDSHRLGQH I EDALLDVASLFQAGQHFKRRFRGSGKENSRHRCGFL SR 
QTGASKASMTSTDF 



SEQ ID NO: 125 | 2748 bp 



NOV28b, 
CG148102-02 
DNA Sequence 



CGAGAGACAGGAATCGGGGTTTCTGGGTGACGGTGJ 



CGACCCGGTGGTGGACTCCTTGCACTGGGATTGGACATATGCAAGCGGGAGATTTGGGGCCGGCGCT 



CAAAATCGGGGGGCGGGGGTGGACTCGGGTTTGGACCCCAGGATCCGATCAGCGGACCCTTGATTCA 



ACGTGGGCTCCAGCGTGACA TGGCTGAAGCGCACCAGGCCGTGGGCTTCCGACCCTCGCTGACCTCG 



GACGGGGCTGAAGTGGAACTCAGTGCCCCTGTGCTGCAGGAGATCTACCTCTCTGGCCTGCGCTCCT 
GGAAAAGGCATCTCTCACGTTTCTGGAATGACTTTCTCACCGGTGTGTTTCCTGCCAGCCCCCTCAG 
TTGGCTTTTCCTCTTCAGTGCCATCCAGCTTGCCTGGTTCCTCCAGCTGGATCCTTCCTTAGGACTG 
ATGGAGAAGATCAAAGAGTTGCTGCCTGACTGGGGTGGACAACACCACGGGCTCCGGGGGGTCCTGG 
CAGCCGCGCTGTTTGCCTCGTGTTTGTGGGGAGCCCTGATCTTCACACTGCACGTGGCCCTGAGGCT 
GCTTCTGTCCTACCACGGCTGGCTTCTTGAGCCCCACGGAGCCATGTCCTCCCCCACCAAGACCTGG 
CTGGCCCTGGTCCGCATCTTCTCTGGCCGCCACCCGATGCTGTTCAGTTACCAGCGCTCCCTGCCAC 
GCCAGCCCGTGCCCTCTGTGCAGGACACCGTGCGCAAGTACCTGGAGTCGGTCCGGCCCATCCTCTC 
CGACGAGGACTTCGACTGGACCGCGGTCCTGGCGCAGGAATTCCTGAGGCTGCAGGCGTCACTGCTG 
CAGTGGTACCTGCGGCTCAAGTCCTGGTGGGCGTCCAATTATGTCAGTGACTGGTGGGAGGAATTTG 
TGTACC TGCGCTCCCG AAATCCGC TGATGGTG AAC AGCAACTATTAC ATGATGGAC TTCC TGTATGT 
CACACCCACGCCTCTGCAGGCAGCTCGCGCTGGGAATGCCGTCCATGCCCTCCTCCTGTACCGCCAC 
CGCCTGAACCGCCAGGAGATACCCCCGACTTTGCTGATGGGAATGCGCCCCTTATGCTCTGCCCAGT 
ACGAGAAGATCTTCAACACCACGCGGATTCCAGGGGTCCAAAAAGACTACATCCGCCACCTCCATGA 
CAGCCAACACGTGGCTGTCTTCCACCGGGGCCGATTCTTCCGCATGGGGACCCACTCCCGAAACAGC 
CTGCTTTCCCCGAGAGCCCTGGAGCAGCAGTTTCAGAGAATCCTGGATGATCCCTCACCGGCCTGCC 
CCC ACGAGGAAC ATC TGGC AGC TC TG ACAGC TGCTCCCAGGGGC ACGTGGGCC CAGGTGCGGAC ATC 
CCTGAAGACCCAGGCAGCGGAGGCCCTGGAGGCGGTGGAAGGGGCCGCTTTCTTTGTGTCACTGGAT 
GCTG AGCC C GCGGGGCTC ACCAGGGAGGACCCGGC AGCGTCGTTGGATGCCTACGCCC ATGC TCTGC 
TGGC TGGC CGGGGCCATGATCGCTGGTTTG AC AAATCCTTCACCCTAATCGTCTTCTCTAACGGGAA 
GCTGGGCCTCAGCGTGGAGCACTCCTGGGCCGACTGCCCCATCTCAGGACACATGTGGGAGTTCACT 
CTGGCTACAGAATGCTTTCAGCTGGGCTACTCAACAGATGGCCACTGCAAGGGGCACCCGGACCCCA 
CACTACCCCAGCCCCAGCGGCTGCAATGGGACCTTCCAGACCAGATCCACTCCTCCATCTCTCTAGC 
CCTGAGGGGAGCCAAGATCTTGTCTGAAAATGTCGACTGCCATGTCGTTCCATTCTCCCTATTTGGC 
AAGAGC TTC ATCCGACGCTGCCACCTCTC TTC AGACAGCTTCATCC AGATCGCCTTGC AACTGGCCC 
ACTTCCGGGACAGGGGTCAATTCTGCCTGACTTATGAGTCGGCCATGACTCGCTTATTCCTGGAAGG 
CCGGACGGAGACGGTGCGGTCTTGCACGAGGGAGGCCTGCAACTTTGTCAGGGCCATGGAGGACAAA 
GAGAAGACGGACCCACAGTGCCTCGCCCTGTTCCGCGTGGCAGTGGACAAGCACCAGGCTCTGCTGA 
AGGCAGCCATGAGCGGGCAGGGAGTTGACCGCCACCTGTTTGCGCTGTACATCGTGTCCCGATTCCT 
CCACCTGCAGTCGCCCTTCCTGACCCAGGTCCATTCGGAGCAGTGGCAGCTGTCCACCAGCCAGATC 
CCTGTTCAGCAAATGCATCTGTTTGACGTCCACAATTACCCGGACTATGTTTCCTCAGGCGGTGGAT 
TCGGGCCTGCTGATGACCATGGTTATGGTGTTTCTTATATCTTCATGGGGGATGGCATGATCACCTT 
CCACATCTCCAGCAAAAAATCAAGCACAAAAACGGATTCCCACAGGCTGGGGCAGCACATTGAGGAC 
GCACTGCTGGATGTGGCCTCCCTGTTCCAGGCGGGACAGCATTTTAAGCGCCGGTTCAGAGGGTCAG 
GGAAGGAGAACTCCAGGCACAGGTGTGGATTTCTCTCCCGCCAGACTGGGGCCTCCAAGGCCTCAAT 
G AC ATCC ACCGACTTCTG ACTCC TTCCAGC AGGCAGCTGGCCTC TCCAAGGAATAAGGGTGAAATTG 



CC ACAGCTGGC TGACAC AGGAC AGGGGC AAC TGGTTTGGCAACCCCACATCCAGGCAAATAAAGATG 



ORF Start: ATG at 221 



ORF Stop: TGA at 2630 



SEQ ID NO: 126 | 803 aa | MW at 90987.8kD 



NOV28b, 
CG148102-02 
Protein Sequence 



MAEAHQAVGFRPSLTSDGAEVELS APVLQE I YL SGLR SWKRHL SRFWNDFLTGVF PAS PLSWLFLF S 
AI QLAWFLQLDPSLGLMEK IKELL PDWGGQHHGLRGVLAAALFASC LWGALI FTLHVALRLLLS YHG 
WLLEPHGAMS S PTKTWLALVRI FSGRHPMLFSYQRSL PRQ PVPSVQDTVRKYLES VRP ILSDEDFDW 
TAVLAQEFLRLQASLLQWYLRLKSWWASNYVSDWWEEF^^^ 

AARAGNAVHAL LL YRHRIiNRQE I P PTLLMGMRPLC S AQYEK I FNTTR I PGVQKDYIRHLHDSQHVAV 
FHRGRFFRMGTHSRNSLLS PRALEQQFQR I LDDPS PAC PHEEHLAAXtTAAPRGTWAQVRTSLKTQAA 
EALEAVEGAAFFVSLDAEPAGLTREDPAASLDAYAHALLAGRGHDRWFDKSFTLIWSNGKLGLSVE 
HSWADCPISGHMWEFTLATECFQLGYST1X3HCKGHPDPTLPQPQR 

LSENVDCHWPFSLFGKSF IRRCHLS SDS F I QI ALQLAHFRDRGQFCLTYES AMTRLFLEGRTETVR 
SCTREACNFVRAMEDKEKTDPQCLALFRVAVDKHQ 

LTQVHSEQWQLSTSQIPVQQMHLFDVHNYPDYVSSGGGFGPADDHGYGVSYIFMGDGMITFHISSKK 
SSTKTDSHRLGQHIEDALLDVA5LFQAGQHFKRRFRGSGKEN5RHRCGFLSRQTGASKASMTSTDF 



Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 28B. 
Table 28B. Comparison of NOV28a against NOV28b. 


Protein Sequence 


NOV28a Residues/ 
Match Residues 


Identities/ 

Similarities for the Matched Region 


NOV28b 


1..751 
1..803 


717/806(88%) 
719/806(88%) 
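
The residue ranges and the denominator in Table 28B can be related by a short calculation, sketched below in Python. The function names residue_span and implied_gaps are hypothetical. The sketch assumes that the denominator of the Identities entry (806) is the number of alignment columns including gap positions, as is conventional for pairwise alignment reports; this document does not state that convention explicitly.

```python
def residue_span(region: str) -> int:
    """Number of residues covered by an inclusive range written as 'start..end'."""
    start, end = (int(part) for part in region.split(".."))
    if end < start:
        raise ValueError(f"range runs backwards: {region!r}")
    return end - start + 1


def implied_gaps(region: str, alignment_columns: int) -> int:
    """Gap positions implied for one sequence, assuming the alignment length
    counts every column, gapped or not (an assumption, see the text above)."""
    gaps = alignment_columns - residue_span(region)
    if gaps < 0:
        raise ValueError("alignment is shorter than the residue range")
    return gaps


# Values copied from Table 28B: NOV28a 1..751, NOV28b 1..803, 806 columns.
print(implied_gaps("1..751", 806))
print(implied_gaps("1..803", 806))
```

Under that assumption, most of the difference between the two variants would be accounted for by positions present in NOV28b but absent from NOV28a.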



Further analysis of the NOV28a protein yielded the following properties shown in 
Table 28C. 



Table 28C. Protein Sequence Properties NOV28a 


PSort analysis: 


0.7900 probability located in plasma membrane; 0.6400 probability located in 
microbody (peroxisome); 0.3000 probability located in Golgi body; 0.2000 
probability located in endoplasmic reticulum (membrane) 


SignalP analysis: 


Cleavage site between residues 5 and 6 



A search of the NOV28a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 28D. 



Table 28D. Geneseq Results for NOV28a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV28a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


AAY79220 


Human transferase 
TRNSFS-12 - Homo sapiens, 
803 aa. [WO200014251-A2, 
16-MAR-2000] 


1..751 
1..803 


740/806 (91%) 
742/806 (91%) 


0.0 


AAE10322 


Human carnitine 
acyltransferase, 26886 - 
Homo sapiens, 803 aa. 
[WO200166759-A2, 
13-SEP-2001] 


1..751 
1..803 


739/806 (91%) 
742/806 (91%) 


0.0 



AAW14438 


Type I carnitine palmitoyl 
transferase-like protein - 
Homo sapiens, 772 aa. 
[JP09009969-A, 
14-JAN-1997] 


1..711 
1..766 


375/770 (48%) 
495/770 (63%) 


0.0 


ABG04960 


Novel human diagnostic 
protein #4951 - Homo 
sapiens, 521 aa. 
[WO200175067-A2, 
11-OCT-2001] 


224..571 
92..471 


337/381 (88%) 
339/381 (88%) 


0.0 


ABB67527 


Drosophila melanogaster 
polypeptide SEQ ID NO 
29373 - Drosophila 
melanogaster, 780 aa. 
[WO200171042-A2, 
27-SEP-2001] 


1..717 
1..765 


315/775 (40%) 
447/775 (57%) 


e-161 


In a BLAST search of public sequence databases, the NOV28a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 28E. 


Table 28E. Public BLASTP Results for NOV28a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV28a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


Q8TCG5 


Carnitine 

palmitoyltransferase IC - 
Homo sapiens (Human), 803 
aa. 


1..751 
1..803 


740/806 (91%) 
742/806 (91%) 


0.0 


CAC88591 


Sequence 1 from Patent 
WO0166759 - Homo sapiens 
(Human), 803 aa. 


1..751 
1..803 


739/806 (91%) 
742/806 (91%) 


0.0 


AAH29104 


Similar to carnitine 
palmitoyltransferase IC - 
Homo sapiens (Human), 792 
aa. 


1..751 
1..792 


729/806 (90%) 
731/806 (90%) 


0.0 


P32198 


Carnitine 

O-palmitoyltransferase I, 
mitochondrial liver isoform 
(EC 2.3.1.21) (CPT I) 
(CPTI-L) - Rattus norvegicus 
(Rat), 773 aa. 


1..710 
1..765 


394/768 (51%) 
524/768 (67%) 


0.0 



Q9BWK0 


Similar to carnitine 
palmitoyltransferase I, liver - 
Homo sapiens (Human), 756 
aa. 


1..690 
1..745 


381/748 (50%) 
510/748 (67%) 


0.0 



PFam analysis predicts that the NOV28a protein contains the domains shown in 
Table 28F. 



Table 28F. Domain Analysis of NOV28a 


Pfam Domain 


NOV28a Match 
Region 


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


Carn_acyltransf 


162..708 


208/680 (31%) 
437/680 (64%) 


1.5e-167 



Example 29. 

The NOV29 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 29A. 



Table 29A. NOV29 Sequence Analysis 




SEQ ID NO: 127 | 1776 bp 


NOV29a, 
CG148431-01 
DNA Sequence 


ACTAAAGCCTGC AGAGACCTCTGAAGG AAAACCTGTCCCGGGCTC TGTCACTTCACACCCATGGC TA 


ACCCTGGAGGTGGTGCTGTTTGCAACGGGAAACTTCACAATCACAAGAAACAGAGCAATGGCTCACA 

AAGCAGAAACTGCACAAAGAATGGAATAGTGAAGGAAGCCCAGCLAAAATGGGAAGCCACATTTTTAT 

GATAAGCTCATTGTTGAATCGTTTGAGGAAGCACCCCTTCATGTTATGGTTTTCACTTACATGGGAT 

ATG GAATTGG AAC CC TGTTTGGCT ATCTC AGAG ACT TTTTAAG AAAC TGG GGAAT AG AAAAATGC AA 

CGCAGCTCTGGAAAGAAAAGAACAAAAAGATTTTGTGCCACTGTATCAAGACTTTGAAAAT 

ACAAGAAACCTTTACATGCGAATCAGAGACAACTGGAACCGGCCCATCTGCAGTGCCCCAGGGCCTC 

TCTTTGATTTGATGGAGAGGGTATCAGACGACTATAACTGGACGTTTAGGTTTACTGGAAGAGTC 

GAAAGATGTCATCAACATGGGCTCCTATAACTTCCTTGGT^ 

AGGACAATAAAGGATGTTTTAGAGGTGTATGGCACAGGCGTGGCCAGCACCAGGCATGAAATGGGCA 
CCTTGGATAAGCACAAGGAGTTGGAGGACCTTGTGGCTAAGTTCCTGAATGTGGAAGCAGCTATGGT 
CTTTGGGATGGGATTCGCAACTAACTCAATGAATATCCCAGCATTAGTTGGAAAGGGATGCCTCATT 
TT AAGTG ATGAGTTAAACC AC ACATCGCTTG TGC TTGGGGC CC GAC TC TC AGGTGC AAC CAT AAG AA 
TC TTC AAACAC AACAAC AC ACAAAGCC TAGAGAAGCTCCTGAGAGATGCTGTC ATCTATGGC CAGCC 
TCGAACCCGCAGAGCTTGGAAAAAGATTCTCATCCTGGTGGAGGGTOTCTACAGCATGGAAGGTTCC 
ATCGTGCATCTGCCCCAGATCATAGCTCTAAAGAAGAAATACAAGGCTTACCTCTACATAGATGAAG 
CTCACAGTATTGGGGCCGTGGGCCCAACCGGCCGGGGTGTCACGGAGTTCTTTGGACTAGACCCTCA 
TGAAGTTGATGTGCTCATGGGCACATTCACCAAAAGTTTTGGAGCTTCAGGAGGTTACATAGCTGGA 
AGGAAGGACCTCGTGGATTATTTACGGGTTCACTCGCATAGTGCTGTTTATGCTTCATCCATGAGCC 
CACCGATAGCAGAGCAAATCATCAGATCACTAAAACTTATCATGGGACTGGATGGGACCACTCAAGG 
GCTGCAGAGAGTACAGCAACTTGCGAAAAACACAAGATACTTCAGACAAAGACTGCAGGAAATGGGA 
TTCATTATCTATGGCMTGAGAATGCTTCTGTTGTT^ 

CGGCTTTTGCAAGGCATATGCTAGAGAAAAAAATTGGAGTGGTGGTCGTGGGATTTCCA 

TT AG AAGC TC TTGATGAAATGGGTGATCTC TTGC AACTGAAAT AT TCC CGGCACAAGAAGTC AGC AC 

RTCCTGAGCTCTATGATGAGACGAGCTTTGAACTCGAAGATTAAGTTTCCTGGTCCTGAATGACAC 

TAAAGACTTTGCGAGAAAGACCTCCCTCCTTGCC 




ORF Start: ATG at 61 |ORF Stop: TAA at 1717 
SEQ ID NO: 128 | 552 aa | MW at 62048.9kD 


NOV29a, 
CG148431-01 
Protein Sequence 


MANPGGGAVCNGKLHNHKKQSNGSQ SRNCTKNGIVKEAQQNGK PHF YDKLIVESFEEAPLHVMVFT Y 
MGYG IGTLFG YLRDFLRNWG I EKCNAAVERKEQKDFVPL YQDFENF YTRNL YMR IRDNWNRP I C SAP 
GPL FDLMERVSDD YNWTFRFTGRVIKDVINMG S YNFLGLAAKYDBSMRT IKDVLEVYGTGVASTRHE 
MGTLDKHKELEDLVAKFLNVEAAMVFGMGFATNSMNIPA^ 

IRI FKHNNTQ SLEKLLRDAVT YGQPRTRRAWKK IL IL VEGVYSMEGS XVHL PQ 1 1 ALKKKYKAYL YI 
DEAHS I GAVG PTGRGVTEFFGLDPHEVDVLMGTFTKSFG ASGGY I AGRKDLVDYLRVHSHSAVYAS s 
MSPPI AEQI IRS LKL IMGLDGTTQGLQRVQQL AKNTR YFRQRLQEMG F 1 1 YGNENAS WPLLLYMPG 
KVAAFARHML EKKIG VWVGF PATPLAEARARFCVSAAHTREMLDTVLEALDEMGDLLQL 
SARPELYDETSFELED 





SEQ ID NO: 129 | 1492 bp 




NOV29b, 
CG148431-02 
DNA Sequence 


CAC CGGATCC ACCATGGCTAACCCTGGAGGTGGTGCTGTTTGC AACGGGAAAC TTCACAATCACAAG 
AAACAGAGCAATGGCTCACAAAGCAGAAACTGCACAAAGAATGGAATAGTGAAGGAAGCCCAGGATT 
TTGTGCCACTGTATCAAGACTTTGAAAATTTTTATACAAGAAACCTTTACATGCGAATCAGAGACAA 
CTGG AACCGGCC C ATCTGC AGTGCCCC AGGGCCTC TGTTTGATGTGATGGAGAGGGTATCGGACGAC 
TATAACTGGACGTTTAGGTTTACTGGAAGAGTCATCAAAGATGTCATCAACATGGGCTCCTATAACT 
TCCTTGGTCTTGCAGCCAAGTATGATGAGTCTATGAGGACAATAAAGGATGTTTTAGAGGTGTATGG 
C AC AGGCGTGGC C AGCACC AGGC ATGAAATGGGCACC TTGGATAAGC AC AAGGAGTTGGAGGACCTT 
GTGGCTAAGTTC C TGAATGTGGAAGC AGC TATGGTCTTTGGGATGGGATTCGCAACTAACTC AATGA 
ATATCCCAGCATTAGTTGGAAAGGGATGCCTCATTTTAAGTGATGAGTTAAACCACACATCGCTTGT 
GCTTGGGGCCCGACTCTCAGGTGCAACCATAAGAATCTTCAAACACAACAACACACAAAGCCTAGAG 
AAGCTCCTGAGAGATGCTGTCATC TATGGCCAGCC TCGAACCCGCAGAGCTTGGAAAAAGATTCTC A 
TCCTGGTGK3AGGGTGTCTACAGCATGGAAGGTTCCATCGTGCATCTGCCCCAGATCATAGCTCTAAA 
GAAGAAATACAAGGCTTACCTCTACATAGATGAAGCTCACAGTATTGGGGCCGTGGGCCCAACCGGC 
CGGGGTGTCACGGAGTTCTTTGGACTAGACCCTCATGAAGTTGATGTGCTCATGGGCACATTCACCA 
AAAGTTTTGGAGCTTCAGGAGGTTACATAGCTGGAAGGAAGGACCTCGTGGATTATTTACGGGTTCA 
CTCGCATAGTGCTGTTTATGCTTCATCCATGAGCCCACCGATAGCAGAGCAAATCATCAGATCACTA 
AAACTTATCATGGGACTGGATGGGACCACTCAAGGGCTGCAGAGAGTACAGCAACTTGCGAAAAACA 
CAAGATACTTCAGACAAAGACTGCAGGAAATGGGATTCATTATCTATGGCAATGAGAATGCTTCTGT 
TGTTCC TCTGCTTCTTTATATGCC TGGTAAAGTAGCGGC TTTTGC AAGGCATATGC TAGAGAAAAAA 
ATTGGAGTGGTGGTCGTGGGATTTCCAGCCACTCCCCTCGCAGAAGCTCGGGCTCGGTTTTGTGTTT 
CAGCGGCACATACCCGGGAGATGTTAGACACGGTTTTAGAAGCTCTTGATGAAATGGGTGATCTCTT 
GCAACTGAAATATTCCCGGCACAAGAAGTCAGCACGTCCTGAGCTCTATGATGAGACGAGCTTTGAA 
CTCGAAGATCTCGAGGGC 




ORF Start: ATG at 14 


ORF Stop: at 1484 





SEQ ID NO: 130 | 490 aa | MW at 54766.5kD 


NOV29b, 
CG148431-02 
Protein Sequence 


MANPGGGAVCNGKiHNHKXQSNGSQSRNCTKNGI 

I C S APGPLFDVMERVSDDYNWTFRFTGRV IKDVINMGS YNFI^LAAKYDESMRTIKDVLEVYGTGV 
STRHEMGTLDKHKELEDLVAKFLNVEAAMVFGMGFATNSMNIPALVGKGCLILSDELNHTSLVLGAR 
LSGAT IRIFKHNNTQSLEKLLRDAVI YGQPRTRRAWKK I L ILVEGVYSMEG S I VHL PQI I ALKKKYK 
AYLYIDEAHSIGAVGPTGRGVTEFFGLDPHEVDVLMGTFTKSFGASGGYIAGRKDLVDYLRVHSHSA 
VYAS SMS P P IAEQI IRSLKL IMGLDGTTQGLQRVQQLAKNTR YFRQRLQEMGFI I YGNENAS WPLL 
L YMPGKVAAF ARHML EKK I GVVWGF PAT PLAEARARFCV S AAHTREMLDTVL EALDEMGDLLQLKY 
SRHKKSARPELYDETSFELED 



Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 29B. 
Table 29B. Comparison of NOV29a against NOV29b. 


Protein Sequence 


NOV29a Residues/ 
Match Residues 


Identities/ 

Similarities for the Matched Region 


NOV29b 


98..552 
36..490 


438/455 (96%) 
440/455 (96%) 



Further analysis of the NOV29a protein yielded the following properties shown in 
Table 29C. 



Table 29C. Protein Sequence Properties NOV29a 


PSort analysis: 


0.4761 probability located in microbody (peroxisome); 0.3000 probability 
located in nucleus; 0.2077 probability located in lysosome (lumen); 0.1000 
probability located in mitochondrial matrix space 


SignalP analysis: 


No Known Signal Sequence Predicted 



A search of the NOV29a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 29D. 



Table 29D. Geneseq Results for NOV29a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV29a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


AAE22153 


Human TRNFR-15 protein - 
Homo sapiens, 532 aa. 
[WO200226950-A2, 
04-APR-2002] 


1..552 
1..552 


551/552 (99%) 
552/552 (99%) 


0.0 


AAG73598 


Human colon cancer antigen 
protein SEQ ID NO:4362 - 
Homo sapiens, 391 aa. 
[WO200122920-A2, 
05-APR-2001] 


201..549 
38..387 


269/352 (76%) 
316/352 (89%) 


e-158 


ABB60160 


Drosophila melanogaster 
polypeptide SEQ ID NO 
7272 - Drosophila 
melanogaster, 597 aa. 
[WO200171042-A2, 
27-SEP-2001] 


54..543 
114..597 


256/491 (52%) 
350/491 (71%) 


e-151 



AAE21820 


Human serine 
palmitoyltransferase 
(SPT)-like enzyme #2 - 
Homo sapiens, 230 aa. 

28-MAR-2002] 


47..276 
1..230 


228/230 (99%) 
230/230 (99%) 


e-133 


AAY32003 


Rice serine 

palmitoyltransferase Lcb2 
subunit - Oryza sativa, 489 
aa. [WO9949021-A1, 
30-SEP-1999] 


59..541 
5..483 


237/485 (48%) 
333/485 (67%) 


e-133 


In a BLAST search of public sequence databases, the NOV29a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 29E. 


Table 29E. Public BLASTP Results for NOV29a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV29a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Portion 


Expect 
Value 


Q9UGB6 


DJ718P11. 1.1 (Novel class II 
aminotransferase similar to 
serine palmotyltransferase 
(Isoform 1)) - Homo sapiens 
(Human), 414 aa (fragment). 


102..515 
1..414 


414/414 (100%) 
414/414 (100%) 


0.0 


O15270 


Serine palmitoyltransferase 2 
(EC 2.3.1.50) (Long chain 
base biosynthesis protein 2) 
(LCB2) 

(Serine-palmitoyl-CoA 
transferase 2) (SPT 2)- 
Homo sapiens (Human), 562 
aa. 


7..549 
18..558 


383/546 (70%) 
449/546 (82%) 


0.0 


P97363 


Serine palmitoyltransferase 2 
(EC 2.3.1.50) (Long chain 
base biosynthesis protein 2) 
(LCB 2) 

(Serine-palmitoyl-CoA 
transferase 2) (SPT 2) - Mus 
musculus (Mouse), 560 aa. 


7..549 
18..556 


379/546 (69%) 
449/546 (81%) 


0.0 


JC5180 


serine C-palmitoyltransferase 
(EC 2.3.1.50) Lcb2 chain - 
mouse, 560 aa. 


7..549 
18..556 


378/546 (69%) 
449/546 (82%) 


0.0 
O54694 


Serine palmitoyltransferase 2 
(EC 2.3.1.50) (Long chain 
base biosynthesis protein 2) 
(LCB 2) 

(Serine-palmitoyl-CoA 
transferase 2) (SPT 2)- 
Cricetulus griseus (Chinese 
hamster), 560 aa. 


7..549 
18..556 


446/546 (81%) 


0.0 



PFam analysis predicts that the NOV29a protein contains the domains shown in 
Table 29F. 



Table 29F. Domain Analysis of NOV29a 


Pfam Domain 


NOV29a Match Region 


Identities/ 
Similarities 
for the Matched 
Region 


Expect Value 


aminotran_1_2 


193..521 


71/363 (20%) 
237/363 (65%) 


2.6e-29 



Example 30. 

The NOV30 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 30A. 



Table 30A. NOV30 Sequence Analysis 




SEQ ID NO: 131 


576 bp 


NOV30a, 
CG148888-01 
DNA Sequence 


TGAGCCAGCCCCGGATOACCCTGCGACCTGGAACAATGCGGCTGGCCTGCATGTTCTCTTCCATCCT 

GCTGTTCGGAGCTGCAGGCCTCCTCCTCTTCATCAGCCTGCAGGACCCTACGGAGCTCGCCCCCCAG 

CAGGTGCCAGGAATAAAGTTCAACATCAGGCCAAGGCAGCCCCACCACGACCTCCCACCAGGCGGCT 

CTGGGGTGCGTTTTCCCGAGTTCGTCCAGTACCTGCTGGACGTGCACCGGCCCGTGGGGATGGACAT 

TCACTGGGACCATGTCAGCCGGCTCTGCAGCCCCTGCCTCATCGACTACGATTTCGTAGGCAAGTTC 

GAGAGCATGGAGGACGATGCCAACTTCTTCCTGAGCCTCATCCGCGCGCCGCGGAACCTGACCTTCC 

CCCGGTTCAAGGACCGGCACTCGCAGGAGGCGCGGACCACAGCGAGGATCGCCCACCAGTACTTCGC 

CCAACTCTCGGCCCTGGAAAGGCAGCGCACCTACGACT 

TATTC CAAGCCC TTTACAGATCTGTAC TGAGGGGC GCCGC 




ORF Start: ATG at 15 


ORF Stop: TGA at 564 



SEQ ID NO: 132 | 183 aa | MW at 21347.3kD 


NOV30a, 
CG148888-01 
Protein Sequence 


MTLRPGTMRLACMFSSILLFGAAGLLLFISLQDPTELAPQQVPGIKFNIRPRQPHHDLPPGGSGVRF
PEFVQYLLDVHRPVGMDIHWDHVSRLCSPCLIDYDFVGKFESMEDDANFFLSLIRAPRNLTFPRFKD
RHSQEARTTARIAHQYFAQLSALQRQRTYDFYYMDYLMFNYSKPFTDLY
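
The ORF Start and ORF Stop positions reported in the sequence tables can be cross-checked against the stated protein length. The following is a minimal sketch, assuming 1-based nucleotide positions and that the ORF Stop value is the position of the first base of the stop codon; the function name orf_stop_position is hypothetical and is not part of the disclosure.

```python
def orf_stop_position(orf_start: int, n_residues: int) -> int:
    """Return the 1-based position of the first base of the stop codon.

    Assumes orf_start is the 1-based position of the A of the ATG start
    codon and that n_residues counts the encoded amino acids, excluding
    the stop codon itself.
    """
    if orf_start < 1 or n_residues < 1:
        raise ValueError("orf_start and n_residues must be positive")
    # Each residue consumes one codon (three bases) beginning at orf_start.
    return orf_start + 3 * n_residues


if __name__ == "__main__":
    # NOV30a: ORF Start at 15, 183 aa
    print(orf_stop_position(15, 183))
```

Under these assumptions the NOV30a values (start 15, 183 aa) give position 564, which agrees with the ORF Stop reported in Table 30A.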



Further analysis of the NOV30a protein yielded the following properties shown in 
Table 30B. 



Table 30B. Protein Sequence Properties NOV30a 


PSort analysis: 


0.8650 probability located in lysosome (lumen); 0.8200 probability located in 
outside; 0.3657 probability located in microbody (peroxisome); 0.1000 
probability located in endoplasmic reticulum (membrane) 


SignalP analysis: 


Cleavage site between residues 38 and 39 
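
The PSort entries in the protein property tables follow a fixed textual pattern, so they can be read into (location, probability) pairs for ranking. The sketch below is illustrative only: the helper names parse_psort and most_probable are assumptions, and the pattern is inferred from the wording used in these tables rather than from any PSort output specification.

```python
import re

# One entry looks like "0.8650 probability located in lysosome (lumen)";
# entries are separated by semicolons.
PSORT_ENTRY = re.compile(r"(\d+\.\d+)\s+probability\s+located\s+in\s+([^;]+)")


def parse_psort(text: str) -> list[tuple[str, float]]:
    """Return (location, probability) pairs in the order they appear."""
    flat = " ".join(text.split())  # undo line wrapping inside the table cell
    return [(m.group(2).strip(), float(m.group(1)))
            for m in PSORT_ENTRY.finditer(flat)]


def most_probable(text: str) -> tuple[str, float]:
    """Return the entry with the highest reported probability."""
    return max(parse_psort(text), key=lambda entry: entry[1])


NOV30A_PSORT = """0.8650 probability located in lysosome (lumen); 0.8200 probability located in
outside; 0.3657 probability located in microbody (peroxisome); 0.1000
probability located in endoplasmic reticulum (membrane)"""

if __name__ == "__main__":
    for location, probability in parse_psort(NOV30A_PSORT):
        print(f"{probability:.4f}  {location}")
```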



A search of the NOV30a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 30C.

Table 30C. Geneseq Results for NOV30a


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date]


NOV30a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


ABB53266 


Human polypeptide #6 - 
Homo sapiens, 424 aa. 
[WO200181363-A1, 
01-NOV-2001] 


62..183 

303..424


121/122 (99%) 
121/122 (99%) 


4e-69 


ABB53265 


Human polypeptide #5 - 
Homo sapiens, 628 aa. 
[WO200181363-A1, 
01-NOV-2001] 


62..183 
507..628


121/122 (99%) 
121/122 (99%) 


4e-69 


AAE15437 


Human drug metabolising 
enzyme (DME)-4 - Homo 
sapiens, 396 aa. 
[WO200179468-A2, 
25-OCT-2001] 


62..183 
275..396


121/122 (99%) 
121/122 (99%) 


4e-69 


AAB85083 


Human interleukin-6 (IL-6) 
like polypeptide - Homo 
sapiens, 171 aa. 
[WO200142484-A1, 
14-JUN-2001] 


62..183 
50..171 


121/122 (99%) 
121/122 (99%) 


4e-69 


AAM24429 


Murine EST encoded protein 
SEQ ID NO: 1954 - Mus 
musculus, 424 aa. 
[WO200154477-A2, 
02-AUG-2001] 


62..183 
303..424 


121/122 (99%) 
121/122 (99%) 


4e-69 
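
Each Identities/Similarities cell pairs a ratio with an integer percentage. A short sketch of how such a cell might be parsed and re-derived follows. It assumes the percentage is truncated rather than rounded, an assumption that is consistent with entries such as 337/367 (91%) in Table 33C but is not stated in the text; parse_ratio and truncated_percent are hypothetical helper names.

```python
import re

# Matches cells such as "121/122 (99%)" or "121/122(99%)".
RATIO = re.compile(r"(\d+)\s*/\s*(\d+)\s*\(\s*(\d+)\s*%\s*\)")


def parse_ratio(cell: str) -> tuple[int, int, int]:
    """Return (matched, aligned_length, reported_percent) from a table cell."""
    m = RATIO.search(cell)
    if m is None:
        raise ValueError(f"not an identities/similarities cell: {cell!r}")
    matched, total, percent = (int(g) for g in m.groups())
    return matched, total, percent


def truncated_percent(matched: int, total: int) -> int:
    """Integer percentage with the fractional part dropped (assumed convention)."""
    return (100 * matched) // total


if __name__ == "__main__":
    matched, total, reported = parse_ratio("121/122 (99%)")
    print(matched, total, reported, truncated_percent(matched, total))
```

A mismatch between the reported and recomputed percentage is a quick way to flag a cell whose digits may have been damaged in transcription.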



In a BLAST search of public sequence databases, the NOV30a protein was found to
have homology to the proteins shown in the BLASTP data in Table 30D.



Table 30D. Public BLASTP Results for NOV30a


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV30a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


Q9H3N2 


GalNAc 4-sulfotransferase 
(GalNAc-4-O-sulfotransferase 1) 
(Carbohydrate (N-acetylgalactosamine
4-O) sulfotransferase 8) (Hypothetical
48.8 kDa protein) - Homo sapiens 
(Human), 424 aa. 


62..183 
303..424 


121/122 (99%) 
121/122 (99%)


1e-68


Q9H2A9


N-acetylgalactosamine-4-O-sulfotransferase
- Homo sapiens (Human), 424 aa.


62..183

303..424 


120/122 (98%) 




Q9BXH4 


GalNAc-4-sulfotransferase 2 - Homo 
sapiens (Human), 443 aa. 


62..179
325..442 


77/118 (65%)
95/118 (80%)


1e-44


Q9BXH3 


GalNAc-4-sulfotransferase 2 - Homo
sapiens (Human), 358 aa. 


62..179 
240..357 


77/118 (65%)
95/118 (80%)


1e-44


Q9BZW9 


N-acetylgalactosamine 
4-O-sulfotransferase 2 GalNAc4ST-2 
- Homo sapiens (Human), 438 aa. 


62..179 
320..437 


77/118 (65%) 
95/118 (80%)


1e-44
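
The residue ranges in the result tables are written as start..end and are inclusive at both ends, so the number of residues covered is end minus start plus one. The sketch below illustrates this arithmetic; parse_span and span_length are hypothetical helper names, and tolerating stray spaces inside a range is a convenience assumption.

```python
def parse_span(span: str) -> tuple[int, int]:
    """Parse a range such as "62..183" into (start, end), both 1-based."""
    start_text, end_text = span.replace(" ", "").split("..")
    start, end = int(start_text), int(end_text)
    if start < 1 or end < start:
        raise ValueError(f"invalid residue range: {span!r}")
    return start, end


def span_length(span: str) -> int:
    """Number of residues covered by an inclusive start..end range."""
    start, end = parse_span(span)
    return end - start + 1


if __name__ == "__main__":
    # NOV30a residues and Q9H3N2 match residues from Table 30D
    print(span_length("62..183"), span_length("303..424"))
```

For an ungapped alignment the query and match ranges have the same length, which also equals the denominator of the Identities ratio (122 for the first entry of Table 30D).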



Example 31.

The NOV31 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 31A.



Table 31A. NOV31 Sequence Analysis 


SEQ ID NO: 133 | 2325 bp


NOV31a, 
CG149008-01 
DNA Sequence 


CCCAGGCCGGACAAGCGTCCCGAAAGCCCCGGGAGAGACTAAGAAGCAATCCTCCCACGCGCTTTCT 


CCCACCCTCGGGCCACTGAGACGGAGGGACAGAGGGCCGCCCTCGCGCGGCCGAGGCCCCGCCTCCC 


GCTCGCCCGCCCGCGCCTCCAGCGGAAGCCGGAAGCAAAAGCGGGTCCTGCTAGCCCCGCGGCTCCG 


AACTCGGTGGTCCTGGAAGCTCCGCAGGATGGGGGAGAAGATGGCGGAAGAGGAGAGGTTCCCCAAT 


ACAACTCATGAGGGTTTCAATGTCACCCTCCACACCACCCTGGTTGTCACGACGAAACTGGTGCTCC 

CGACCCCTGGCAAGCCCATCCTCCCCGTGCAGACAGGGGAGCAGGCCCAGCAAGAGGAGCAGTCCAG 

CGGCATGACCATTTTCTTCAGCCTCCTTGTCCTAGCTATCTGCATCATATTGGTGCATTTACTGATC

CGATACAGATTACATTTCTTGCCAGAGAGTGTTGCTGTTGTTTCTTTAGGTATTCTCATGGGAGCAG 

TTATAAAAATTATAGAGTTTAAAAAACTGGCGAATTGGAAGGAAGAAGAAATGTTTCGTCCAAACAT 

GTTTTTCCTCCTCCTGCTTCCCCCTATTATCTTTGAGTCTGGATATTCATTACACAAGGTGAGACTC

AGGCACACATTGGGTAACTTCTTTCAAAATATTGGTTCC^TCACCCTGTTTGCTC 

CAATCTCCGCTTTTGTAGTAGGTGGAGGAATTTATTTTCTGGGTCAGGCTGATGTAATCTCTAAACT 

CAACATGACAGACAGTTTTGCGTTTGGCTCCCTAATATCTGCTGTCGATCCAGTGGCCACTATTGCC 

ATTTTC AATGCACTTCATGTGGACCCCGTGCTCAACATGCTGGTCTTTGGAGAAAGTATTCTCAAC 

ATGCAGTCTCCATTGTTCTGACCAACACAGCTGAAGGTTTAACAAGAAAAAATATGTCAGATGTCAG

TGGGTGGCAAACATTTTTACAAGCCCTTGACTACTTCCTCAAAATGTTCTTTGGCTCTGCAGCGCTC 

GGCACTCTCACTGGCTTAATTTCTGCATTAGTGCTGAAGCATATTGACTTGAGGAAAACGCCTTCCT

TGGAGTTTGGCATGATGATCATTTTTGCTTATCTGCCTTATGGGCTTGGAGAAGGAA 

AGGCATCATGGCCATCCTGTTCTCAGGCATCGTGATGTCCCACTACACGCACCATAACCTCTCCCCA 

GTCACCCAGATCCTCATGCAGCAGACCCTCCGCACCGTGGCCTTCTTATGTGAAACATGTGTGTTTG

CATTTCTTGGCCTGTCCATTTTTAGTTTTCCTCACAAGTTTGAAATTTCCTTTGTCA 

AGTGCTTGTACTATTTGGCAGAGCGGTAAACATTTTCCCTCTTTCCTACCTCCTGAATTTCTT 

GATCATAAAATCACACCGAAGATGATGTTCATCATGTGGTTTAGTGGCCTGCGGGGAGCCATCCCCT 

ATGCCCTGAGCCTACACCTGGACCTGGAGCCCATGGAGAAGCGGCAGCTCATCGGCACCACCACCAT 

CGTCATCGTGCTCTTCACCATCCTGCTGCTGGGCGGCAGCACCATGCCCCTCATTCGCCTCATGGAC 

ATCGAGGACGCCAAGGCACACCGCAGGAACAAGAAGGACGTCAACCTCAGCAAGACTGAGAAGATGG

GCAACACTGTGGAGTCGGAGCACCTGTCGGAGCTCACGGAGGAGGAGTACGAGGCCCACTACATCAG 

GCGGCAGGACCTTAAGGGCTTCGTGTGGCTGGACGCCAAGTACCTGAACCCCTTCTTCACTCGGAGG 

CTGACGCAGGAGGACCTGCACCACGGGCGCATCCAGATGAAAACTCTCACCAACAAGTGGTACGAGG 

HCXVPHCacc Ar^^f!f?r!TrC!GGrTCCGAGGACGACG AGCAGGAGC TGC TCTGACGCC AGGTGCCAAG 

GCTTCAGGCAGGCAGGCCCAGGATGGGCGTTTGCT<X:GCACAGA(^CTCAGCAGGGGCCTCGCA 


ATGCGTGCATCCAGCAGCCCCTTCAAGACATAAGAGGGCGGGGCGAGGTACTGGCTGCAGAGTCGCC 


TTAGTCCAGAACCTGACAGGCCTCTGGAGCCAGGCGACTTCTTGGGAAACTGTCATCTCCCGACTCC 


TCCCTGAGCCAGCCTCCGCTCAGTGTGGCTCCTCAGCCCACAGAGGGGAGGGAGCATGGGGCCAGGT 


GCCAGTCATCTGTGAAGCTAGGGCGCCTACCCCCCCACCCGGAGGAC 




ORF Start: ATG at 230 | ORF Stop: TGA at 1994

SEQ ID NO: 134 | 588 aa | MW at 66297.1kD


NOV31a, 
CG149008-01 
Protein Sequence 


MGEKMAEEERFPNTTHEGFNVTLHTTLVVTTKLVLPTPGKPILPVQTGEQAQQEEQSSGMTIFFSLL
VLAICIILVHLLIRYRLHFLPESVAVVSLGILMGAVIKIIEFKKLANWKEEEMFRPNMFFLLLLPPI
IFESGYSLHKVRLRHTLGNFFQNIGSITLFAVFGTAISAFVVGGGIYFLGQADVISKLNMTDSFAFG
SLISAVDPVATIAIFNALHVDPVLNMLVFGESILNDAVSIVLTNTAEGLTRKNMSDVSGWQTFLQAL
DYFLKMFFGSAALGTLTGLISALVLKHIDLRKTPSLEFGMMIIFAYLPYGLAEGISLSGIMAILFSG
IVMSHYTHHNLSPVTQILMQQTLRTVAFLCETCVFAFLGLSIFSFPHKFEISFVIWCIVLVLFGRAV
NIFPLSYLLNFFRDHKITPKMMFIMWFSGLRGAIPYALSLHLDLEPMEKRQLIGTTTIVIVLFTILL
LGGSTMPLIRLMDIEDAKAHRRNKKDVNLSKTEKMGNTVESEHLSELTEEEYEAHYIRRQDLKGFVW
LDAKYLNPFFTRRLTQEDLHHGRIQMKTLTNKWYEEVRQGPSGSEDDEQELL



Further analysis of the NOV31a protein yielded the following properties shown in
Table 31B.



Table 31B. Protein Sequence Properties NOV31a 


PSort analysis: 


0.8000 probability located in plasma membrane; 0.4000 probability located in 
Golgi body; 0.3000 probability located in endoplasmic reticulum (membrane); 
0.3000 probability located in microbody (peroxisome) 


SignalP analysis: 


Cleavage site between residues 40 and 41 



A search of the NOV31a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 31C.

Table 31C. Geneseq Results for NOV31a


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV31a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Region 


Expect 
Value 


ABG61535 


Human transporter and ion 
channel, TRICH5, Incyte ID 
7476938CD1 - Homo
sapiens, 671 aa. 
[WO200240541-A2, 
23-MAY-2002]


1..588 
91..671 


581/588 (98%) 
581/588 (98%) 


0.0 


AAM24062 


Human EST encoded protein 
SEQ ID NO: 1587 - Homo
sapiens, 315 aa. 
[WO200154477-A2, 
02-AUG-2001] 


274..588 
1..315 


315/315 (100%) 
315/315 (100%) 


0.0 
AAB29621


Cat flea HMT Na/H 
transporter, SEQ ID 
NO: 1868 - Ctenocephalides 
felis, 608 aa. 
[WO200061621-A2, 



8..584 
33..602 


329/585 (56%)
416/585 (70%) 


e-175 


ABB59364 


Drosophila melanogaster 
polypeptide SEQ ID NO 
4884 - Drosophila 
melanogaster, 649 aa. 

27-SEP-2001] 


44..587
86..635


310/562 (55%) 
399/562 (70%)


e-170 


AAO14196


Human transporter and ion 
channel TRICH-13 - Homo 
sapiens, 631 aa. 
[WO200204520-A2, 
17-JAN-2002]


117..547
125..542 


166/439 (37%) 
253/439 (56%) 


2e-72 
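
Expect values appear in the tables in three textual forms: a plain decimal (0.0), full scientific notation (4e-69), and a shorthand that omits the leading coefficient (e-175). The following sketch normalizes all three to floating-point numbers so that hits can be sorted; treating the shorthand as a coefficient of 1 is an assumption, and parse_expect is a hypothetical helper name.

```python
def parse_expect(text: str) -> float:
    """Convert an Expect value as printed in the tables to a float.

    The shorthand "e-175" is read as 1e-175 (assumed convention).
    """
    cleaned = text.strip().lower()
    if cleaned.startswith("e"):
        cleaned = "1" + cleaned
    return float(cleaned)


if __name__ == "__main__":
    # Expect values as listed in Table 31C
    values = ["0.0", "0.0", "e-175", "e-170", "2e-72"]
    for raw in sorted(values, key=parse_expect):
        print(raw, parse_expect(raw))
```

Smaller values indicate more significant matches, so sorting in ascending order places the strongest hits first.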


In a BLAST search of public sequence databases, the NOV31a protein was found to
have homology to the proteins shown in the BLASTP data in Table 31D.


Table 31D. Public BLASTP Results for NOV31a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV31a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Portion 


Expect 
Value 


BAA76783 


KIAA0939 protein - Homo 
sapiens (Human), 595 aa 
(fragment). 


1..588 
15..595 


581/588 (98%) 
581/588 (98%) 


0.0 


Q8R4D1 


Na-H exchanger isoform 
NHE8 - Mus musculus 
(Mouse), 576 aa. 


5..587
1..575 


556/583 (95%) 
565/583 (96%) 


0.0 


Q9Y507 


DJ963K23.4 (Continues in 
dJ1041C10 (AL162615))- 
Homo sapiens (Human), 437 
aa (fragment). 


152..588 
1..437 


437/437 (100%) 
437/437 (100%) 


0.0 


Q9Y2E8 


KIAA0939 protein - Homo 
sapiens (Human), 411 aa
(fragment). 


182..588 
5..411 


405/407 (99%) 
406/407 (99%) 


0.0 


AAH34508 


Hypothetical protein - Mus 
musculus (Mouse), 388 aa 
(fragment). 


209..587 
9..387 


366/379 (96%) 
374/379 (98%) 


0.0 
PFam analysis predicts that the NOV31a protein contains the domains shown in
Table 31E. 



Table 31E. Domain Analysis of NOV31a 


Pfam Domain 


NOV31a Match 
Region 


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


Na_H_Exchanger


62..485 


141/465 (30%) 
345/465 (74%) 


3.1e-98 
Example 32.

The NOV32 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 32A.

Table 32A. NOV32 Sequence Analysis




SEQ ID NO: 135 | 367 bp


NOV32a, 
CG149350-01 
DNA Sequence 


ATGGCGGGGAGAAGGAAGCTCATCGCAGTGATCAGAGACAAGGACACGGTGACTGGTTTCCTGCTGG 
GCAGCATAGGGGAGCTTAACAAGAACTGCCACCCCAATTTCCTGGTGGTGGAGAAGGATACGACCAT 
CAATGAGATCGAAGACACTTTCCGGCAATTTCTAAACCGGGATGACACTGGCATCATCCTCATCAAC 
CAGTACATCGCAGAGATGGTGCAGCATGCCCTGGACACCCACCAGCACTCTATCCCTACTGTCCTGG
AGATCCCCTCCAAGGAGCACCCATATGAGGACGCCAAGGACTCCACCCTGCGGAGGGCCAGGGGCAT
GTTCACTGCCGAAGACCTGTGCTAGGGTCTTT




ORF Start: ATG at 1 


ORF Stop: TAG at 358



SEQ ID NO: 136 | 119 aa | MW at 13566.3kD


NOV32a, 
CG149350-01 
Protein Sequence 


MAGRRKLIAVIRDKDTVTGFLLGSIGELNKNCHPNFLVVEKDTTINEIEDTFRQFLNRDDTGIILIN
QYIAEMVQHALDTHQHSIPTVLEIPSKEHPYEDAKDSTLRRARGMFTAEDLC



SEQ ID NO: 137 | 367 bp


NOV32b, 
CG149350-02 
DNA Sequence 


ATGGCGGGGAGAAGGAAGCTCATCGCAGTGATCAGAGACAAGGACACGGTGACTGGTTTCCTGCTGG

GCAGCATAGGGGAGCTTAACAAGAACTGCCACCCCAATTTCCTGGTGGTGGAGAAGGATACGACCAT 

CAATGAGATCGAAGACACTTTCCGGCAATTTCTAAACCGGX3ATGACACTGGCJVTCATCCTC 

CAGTACATCGCAGAGATGGTGX^GCATGCCCTGGACACCCACC^GCACTCTATCCCTACTGTCCTGG 

AGATCCCCTCCAAGGAGCACCCATATGAGGACGCCAAGGACTCCACCCTGCGGAGGGCCAGGGGCAT

GTTCACTGCCGAAGACCTGTGCTAGGGTCTTT 




ORF Start: ATG at 1 | ORF Stop: TAG at 358

SEQ ID NO: 138 | 119 aa | MW at 13566.3kD


NOV32b, 
CG149350-02 
Protein Sequence 


MAGRRKLIAVIRDKDTVTGFLLGSIGELNKNCHPNFLVVEKDTTINEIEDTFRQFLNRDDTGIILIN 
QYIAEMVQHALDTHQHSIPTVLEIPSKEHPYEDAKDSTLRRARGMFTAEDLC 



Sequence comparison of the above protein sequences yields the following sequence
relationships shown in Table 32B.

Table 32B. Comparison of NOV32a against NOV32b.


Protein Sequence 


NOV32a Residues/ 
Match Residues 


Identities/ 

Similarities for the Matched Region 


NOV32b 


1..119 
1..119 


119/119 (100%)
119/119 (100%)



Further analysis of the NOV32a protein yielded the following properties shown in 
Table 32C. 




Table 32C. Protein Sequence Properties NOV32a 


PSort analysis: 


0.4852 probability located in mitochondrial matrix space; 0.4500 probability 
located in cytoplasm; 0.1957 probability located in mitochondrial inner 
membrane; 0.1957 probability located in mitochondrial intermembrane space 


SignalP analysis: 


No Known Signal Sequence Predicted 



A search of the NOV32a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 32D.



Table 32D. Geneseq Results for NOV32a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV32a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 
AAW27337


Human vacuolar ATPase 14 
kDa subunit hV-14B - Homo 
sapiens, 119 aa. 
[JP09168390-A, 
30-JUN-1997] 


1..118 

1..118 


105/118 (88%)
108/118 (90%)


2e-54 


AAW27336 


Human vacuolar ATPase 14 
kDa subunit hV-14A - Homo 
sapiens, 119 aa.
[JP09168390-A,
30-JUN-1997]


1..118 
1..118


104/118 (88%)
107/118 (90%)


8e-54 


ABB62928 


Drosophila melanogaster 
polypeptide SEQ ID NO 
15576 - Drosophila
melanogaster, 124 aa.
[WO200171042-A2,
27-SEP-2001] 


6..118 
10..122


71/113 (62%)
91/113 (79%)


2e-38 


ABB57798 


Drosophila melanogaster 
polypeptide SEQ ID NO 186 
- Drosophila melanogaster, 
124 aa. [WO200171042-A2, 
27-SEP-2001] 


6..114
10..118 


58/109 (53%) 
84/109 (76%) 


7e-29 


AAG35989 


Zea mays protein fragment 
SEQ ID NO: 44042 - Zea 
mays subsp. mays, 130 aa. 
[EP1033405-A2, 
06-SEP-2000] 


1..118 
1..125


56/125 (44%) 
85/125 (67%) 


1e-27



In a BLAST search of public sequence databases, the NOV32a protein was found to
have homology to the proteins shown in the BLASTP data in Table 32E.

Table 32E. Public BLASTP Results for NOV32a




Protein 

Accession 

Number 


Protein/Organism/Length


NOV32a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


P50408


Vacuolar ATP synthase 
subunit F (EC 3.6.3.14)
(V-ATPase F subunit)
(Vacuolar proton pump F
subunit) (V-ATPase 14 kDa
subunit) - Rattus norvegicus
(Rat), 119 aa.


1..118 
1..118 


104/118 (88%) 
108/118 (91%)


1e-53


Q16864


Vacuolar ATP synthase 
subunit F (EC 3.6.3.14)
(V-ATPase F subunit)
(Vacuolar proton pump F
subunit) (V-ATPase 14 kDa
subunit) - Homo sapiens
(Human), 119 aa.




1..118 
1..118


104/118 (88%)
107/118 (90%)


2e-53


Q9D1K2


1110004G16Rik protein -
Mus musculus (Mouse), 119 
aa. 


1..118 
1..118 


103/118 (87%)
108/118 (91%)


5e-53 


Q28029 


Vacuolar ATP synthase 
subunit F (EC 3.6.3.14) 
(V-ATPase F subunit) 
(Vacuolar proton pump F 
subunit) (V-ATPase 14 kDa 
subunit) - Bos taurus 
(Bovine), 110 aa (fragment).


10..118 
1..109


97/109 (88%) 
100/109 (90%) 


7e-50 


Q9I8H3 


Vacuolar ATP synthase 
subunit F (EC 3.6.3.14) 
(V-ATPase F subunit) 
(Vacuolar proton pump F 
subunit) (V-ATPase 14 kDa 
subunit) - Xenopus laevis 
(African clawed frog), 110 aa 
(fragment). 


10..118 
1..109 


83/109 (76%) 
94/109 (86%) 


1e-43



PFam analysis predicts that the NOV32a protein contains the domains shown in
Table 32F.

Table 32F. Domain Analysis of NOV32a


Pfam Domain 


NOV32a Match Region 


Identities/ 
Similarities 
for the Matched 
Region 


Expect Value 


ATP-synt_F


8..108 


51/107 (48%)
90/107 (84%) 


9.2e-43 



Example 33. 

The NOV33 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 33A.

Table 33A. NOV33 Sequence Analysis




SEQ ID NO: 139 | 1510 bp


NOV33a, 
CG149463-01
DNA Sequence 


ATGGGTTCAG ACTTTATGCC CTG AAAAGATCCTTCCAG C CCTGGCCATC TTGGACTTCTGGAGCTAr 


CCTGGCTCACAGGGGTCTTGTTGCCCTGGGTGTCCCCAGTTCTTGAAAAGAATCAGCCTGGGAGGGG 


CCACACCCTGACCATCCCCCTTTATCCCTTCTGAGATGTTTGTTAGGAAGTCTGGGTCCAGGGGATA 


TCATTTCTTGTTCCATCCATGCAGGGGTTGCTTACCTCGGGTAGGAAACCCTCAGGCGGTGGnAGnT 

GCACAGGTAGGGGAGGATGGAGAGGGCAGTGGTGCCTGAAGCCCTGGATGGGCGGAGCTGACCCCCC 

AACACCAACTCTATCATGCCTGCTCCTCCCTGTCCCCCCAGAGCTGCCTGATCATTGCTACAGAATG 

AACTCTAGCCCAGCTGGGACCCCAAGTCCACAGCCCTCCAGGGCCAATGGGAACATCAACCTGGGGC 

CTTCAGCCAACCCAAATGCCCAGCCCACGGACTTCGACTTCCTCAAAGTCATCGGCAAAGGGAACTA

CGGGAAGGTCCTACTGGCCAAGCGCAAGTCTGATGGGGCGTTCTATGCAGTGAAGGTACTACAGAAA 

AAGTCCATCTTAAAGAAGAAAGAGCAGAGCCACATCATGGCAGAGCGCAGTGTGCTTCTGAAGAACG 

TGCGGCACCCCTTCCTCGTGGGCCIXX^GCTACTCCTTCCAGACACCTGAGAAGCTCTA 

CGACTATGTCAACGGGGGAGAGCTCTTCTTCCACCTGCAGCGGGAGCGCCGGTTCCTGGAGCCCCGG 

GCCAGGTTCTACGCTGCTGAGGTGGCCAGCGCCATTGGCTACCTGCACTCCCTCAACATCATTTACA 

GGGATCTGAAACCAGAGAACATTCTCTTGGACTGCCAGTACTTGGCACCTGAAGTGCTTCGGAAAGA 

GCCTTATGATCGAGCAGTGGACTGGTGGTGCTTGGGGGCAGTCCTCTACGAGATGCTCCATGGCCTG 

CCGCCCTTCTACAGCCAAGATGTATCCCAGATGTATGAGAACATTCTGCACCAGCCGCTACAGATCC 

CCGGAGGCCGGACAGTGGCCGCCTGTGACCTCCTGCAAAGCCTTCTCCACAAGGACCAGAGGCAGCG 

GCTGGGCTCCAAAGCAGACTTTCTTGAGATTAAGAACCATGTATTCTTCAGCCCCATAAACTGGGAT

GACCTGTACCACAAGAGGCTAACTCCACCCTTCAACCCAAATGTGACAGGACCTGCTGACTTGAAGC

ATTTOGACCCAGAGTTCACCCAGGAAGCTGTGTCCAAGTCCATTGGCTGTACCCCCGACACTGTGGC 

CAGCAGCTCTGGGGCCTCAAGTGCATTCCTGGGATTTTCTTATGCGCCAGAGGATGATGACATCTTG 

GATTGTTAGAAGAGAAGGGCCTGTGAAACTACTGAGGCCAGCTGGTATTAGTAAGGAATTACCTTCA 


GCTGCTAGGAAGAGCGACTCAAACTAACAATGGCTT 




ORF Start: ATG at 220 | ORF Stop: TAG at 1414





SEQ ID NO: 140 |398 aa |MW at 44552.5kD 


NOV33a, 
CG149463-01 
Protein Sequence 


MQGLLTSGRKPSGGGRCTGRGGWRGQWCLKPWMGGADPPTPTLSCLLLPVPPELPDHCYRMNSSPAG
TPSPQPSRANGNINLGPSANPNAQPTDFDFLKVIGKGNYGKVLLAKRKSDGAFYAVKVLQKKSILKK
KEQSHIMAERSVLLKNVRHPFLVGLRYSFQTPEKLYFVLDYVNGGELFFHLQRERRFLEPRARFYAA
EVASAIGYLHSLNIIYRDLKPENILLDCQYLAPEVLRKEPYDRAVDWWCLGAVLYEMLHGLPPFYSQ
DVSQMYENILHQPLQIPGGRTVAACDLLQSLLHKDQRQRLGSKADFLEIKNHVFFSPINWDDLYHKR
LTPPFNPNVTGPADLKHFDPEFTQEAVSKSIGCTPDTVASSSGASSAFLGFSYAPEDDDILDC



Further analysis of the NOV33a protein yielded the following properties shown in 
Table 33B. 



Table 33B. Protein Sequence Properties NOV33a 


PSort analysis: 


0.4500 probability located in cytoplasm; 0.2677 probability located in 
microbody (peroxisome); 0.1859 probability located in lysosome (lumen);
0.1000 probability located in mitochondrial matrix space 


SignalP analysis: 


No Known Signal Sequence Predicted 
A search of the NOV33a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 33C.

Table 33C. Geneseq Results for NOV33a




Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date]


NOV33a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


AAY95276 


Human serum and 
glucocorticoid-induced 
protein kinase 2-beta - Homo 
sapiens, 427 aa. 
[WO200035946-A1, 
22-JUN-2000] 


1..398 
1..427 


398/427 (93%) 
398/427 (93%) 


0.0 


AAM25594 


Human protein sequence 
SEQ ID NO: 1109 - Homo
sapiens, 382 aa. 
[WO200153455-A2, 
26-JUL-2001] 


53..398 
8..382 


346/375 (92%) 
346/375 (92%) 


0.0 


AAE22765 


Human serum and 
glucocorticoid-induced
protein kinase, SGK2-alpha - 
Homo sapiens, 367 aa. 
[WO200224947-A2, 
28-MAR-2002] 


61..398 
1..367 


338/367 (92%) 
338/367 (92%) 


0.0 


AAB65708 


Novel protein kinase, SEQ 
ID NO: 237 - Homo sapiens, 
367 aa. [WO200073469-A2, 
07-DEC-2000] 


61..398 
1..367 


337/367 (91%) 
338/367 (91%)


0.0 


AAB65615 


Novel protein kinase, SEQ 
ID NO: 141 - Mus musculus,
244 aa. [WO200073469-A2, 
07-DEC-2000] 


184..398 
1..244 


215/244 (88%) 
215/244 (88%)


e-122 



In a BLAST search of public sequence databases, the NOV33a protein was found to
have homology to the proteins shown in the BLASTP data in Table 33D.



Table 33D. Public BLASTP Results for NOV33a

Protein

Accession 

Number 


Protein/Organism/Length 


NOV33a
Residues/
Match
Residues


Identities/
Similarities for
the Matched
Portion


Expect
Value


Q9HBY8 


Protein kinase - Homo sapiens 
(Human), 427 aa.


1..398 


398/427 (93%) 
398/427 (93%)


0.0 


Q9UKG6 


Protein kinase (DJ138B7.2) 
(Serum/glucocorticoid
regulated kinase) (Similar to
regulated kinase 2) - Homo 
sapiens (Human), 367 aa. 


61..398
1..367 


338/367 (92%) 
338/367 (92%) 


0.0 


Q8R0P6 


Serum/glucocorticoid 
regulated kinase 2 - Mus 
musculus (Mouse), 366 aa. 


61..397 
1..365 


317/366 (86%) 
326/366 (88%) 


0.0 


O73927


S-sgk2 - Squalus acanthias 
(Spiny dogfish), 594 aa. 


70..396 
236..594 


235/359 (65%) 
277/359 (76%)


e-133 


O73926


S-sgkl - Squalus acanthias 
(Spiny dogfish), 433 aa. 


61..396 
60..433 


239/374 (63%) 
282/374 (74%) 


e-132 



PFam analysis predicts that the NOV33a protein contains the domains shown in
Table 33E.

Table 33E. Domain Analysis of NOV33a


Pfam Domain 


NOV33a Match Region 


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


pkinase 


95..228 


54/135 (40%) 
116/135 (86%)


5e-39 


pkinase 


231..323 


35/128 (27%) 
69/128 (54%) 


1.5e-21 


pkinase_C 


324..393 


25/73 (34%) 
47/73 (64%) 


3.1e-15 



Example 34. 

The NOV34 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 34A.

Table 34A. NOV34 Sequence Analysis




SEQ ID NO: 141 | 2152 bp




GGGGGGCCTGAGCCTCTCCGCCGGCGCAGGCTCTGCTCGCGCCAGCTCGCTCCCGCAGCCATGCCCA 


NOV34a, 
CG149536-01 
DNA Sequence 


CCACCATCGAGCGGGAGTTCGAAGAGTTGGATACTCAGCGTCGCTGGCAGCCGCTGTACTTGGAAAT 
TCGAAATGAGTCCCATGACTATCCTCATAGAGTGGCCAA i t-GAAAt, AGA 
m7\ r» -iv r» * a nv*»rn * tv f*r*r*r* a t a tp a TP a P aPTPPTPTT AAACTGC AAAATGC TG AGAATGATTATATTA 
_ m — • /^mirim * /^/iunr» a p a t a p a a papy^APaAAPPAfiTTAPATPTTAACACAGGGACCACTTP.P.'TAA 
CACATGCTGCCATTTCTGGCTTATGGTTTGGCAGCAGAAGACCAAAGCAGTTGTCATGCTGAACCGC 
ATTGTGGAG AGAGAATCG AGTGGTGAAACC AGAACAATATCTC AC TTTC ATTAT AC TAC CTGGCCAG 

_ _ ,->mnn/-imn 7v Tv mr> nrT'Tr' a TTTPTP a aTTTPTTPTTT A a ARTR AG AG A ATPTRR f"PT 
ATTTTGGAGTCCCTGAAIv-Al-C.AG\-i XLnl 1 lUiv/inX 1 lv~x i«i i x t\txt\\z x \jr\\jn\jnn x v^, x a ^ 

„ rrir _^ > » _ oorvn -. pr A aipr-rprarrrrTP aTPP aPTP-Taf^TRP AP.RP ATTGGGCGCTCTGGC ACCTTC 
CTTGAACCCTGACCAxGoGv— >- 1 vjL-ovj 1 bn ± V^v-/\v_ l\j l AVs l w wl\3Uv.ft i x\s\jvjv«vjv- a\- x yjvj\»nv.\« x x v- 

mi-i m^im/i <-»m n n <"» tv r»fnrr»r' nv h^ov^ttttp a TPPa a a A AP/2APATPATATTaap aTAAAACAAGTGTTAC 

TC TCT GGTAGAv- Al— X 1 b 1L 1 xG 1111 vjtt X 0\»tt*Vi/TAV9VJ«.VXf». X Vjx\ ini x wv^*»i nfwiv.nno a \j a j. 

_ _ _ — _ — - «. • TvmTv r*r*r*-* *r?v^/^i^nv^TTaTTpapappppAPATPA aptpapattptpatap ATGGf 
TGAAC ATGAGAAAAT ACCGAA XajGG It- 1 1 A 1 1\-A\jAL.ui~ uioH l uuiv, lunon hv,iv-ai /iv-/* x vjuv* 

T AT AATAGAAGGAGCAAAATGTA 1 AAAGGGAGA ilLl Av» a a X al^vjamw-ij /* x\j\j/*m/*u/\/t.u iiiv.i 
_ _ _ , _ ni/ <i »nn/ , i[Ti/>r<r<iT<miT<PTi <TV , 7AT»nv" , a/" , Pa a in &217iTll a TP APTPA A A A ATAP& ATPPP A 

AAGGAAGACTTATCTCCTGCCTTTGATC-Al i\^v-v-AAAL.AAAir»ii*/ix\j/T»v- i\3.tt/*/Tj*/7.Attv-/*/T»A\j<j\j/T> 

, * m-M /srimnmTioT. Tvr»Tv T\/~»a a a a irTfiup app.TP.aPPP. aTGT AP APP APTTTCCTCTAAAATGCA 
AC AGAATAGGT CT AGAAGAAGAAAAAL. 1 wil-.tt.ws 1 vj/w^v- vj/t» x vj x /»A-/40\3.rvv- x x x x v. x nnnn x wv,n 

■» _ _ _ _ tv <-Hi^ ^im/^Tv/*»Tv/*'fTW"iti^ , rn7ArT^va a APPTaTTPPapappaP APA A APPPPAPP 
AGATAC AATGGAGGAGAACAGTGAGAGxX5CTCTAt-GGAAAv-G 1 A 1 1 ui^v^vjvjttv-ttvjtt/^ijvjv- v^/*v,v- 

_ mr^Tv tv ivr^ A<^Tv/^r»/^rpTv TAafivapaaTpaappaaaaaPAa A aapptppt 

ACAGCTCAGAAGGTGCAGC AGATGAAAC AGAGGCTAAATGAGAA l GAAUGAAAAtt^/rJ\/4Att\jvj l vju i 

A ^ . » _ — — iniiusiiiM« run* * m/^/^/*»/^nlfTtfmv rTW^rrtr , 7V/'* , 'P/ -, 'aTTTT/ , 3PTTPPPPPTTTTPTTPP 

TATATTGGCAACCTATTCTCACTAAGATGGGGTTTAxTjTCAGACAI 1 1 1GG1 ivjijLijv. 1 1 1 rvsi ivjva 
n«prr »p 7A ptpttttttpapP A A A ATGCCCTATAAACAATTAATTTTGCCCAGCAAGCTTCTGCACTA 


GTAACTGACAGTGCTACATTAATCATAGGGGTTTGTCTGCAGCAAACGCCTCATATCCCAAAA^ 


TCCAGTAGAATAGACATCAACCAGATAAGTGATATTTACAGTCACAAGCCCAACATCTCAGGACTCT 


TGACTGCAGGTTCCTCxXjAACCCCAAACTGTAAATGCtCTGTCTAAAATAAAGACATTCATGTTTC 


AAAAACTGGTAAArrTTGCAACTGTATTCATACATGTCAAACACAGTA^ 

agatatcctttatcacaggatttgtttttggaggctatctggattttaacctgcacttgatataagc 


AATAAATATTGTGGTTTTATC TACGTTATTGGAAAG AAAATGAC ATTTAAATAATGTG TGTAATGTA 


TAATGTACTATTGACATCGGCATCAACACTTTTATTCTTAAGCATTTCAGGGTAAATATATTTTATA 


AGTATC TATTT AATCTTTTGTAGTTAAC TGT ACTTTTTAAGAGCTC AATTTGAAAAATCTGTTACTA 


AAAAAAAAAATTGTATGTCGATTGAATTGTACTGGATACAx^TTCCATTTTTCTAA 


ATAxX3AGCAGTTAGAAGTTGGAATAAGCAAT i I^CTACTATATATTGCATTTCTTTTATGTTTTACAG 


TTTTCCCCATTTTAAAAAGAAAAGC AAACAAAGAAACAAAAGTTTTTCC TAAAAATATC TTTGAAGG 


aaaattctccttactgggatagtcaggtaaacagttggtcaagactttgtaaagaaatt<x;tttctc 


TAAATCCCATTATTGATATGTTTATTTO^ATGAAAATOTCAATGTAGTTGGGGTA^ 


agksaagcaaaagtaagaagcagcattttatgattcataatttcagtttactagactgaagttt^tc 


GTAAACCC 




ORF Start: ATG at 61 | ORF Stop: TAA at 1171





SEQ ID NO: 142 |370 aa |MW at 43248.9kD 


NOV34a, 
CG149536-01 
Protein Sequence 


MPTTIEREFEELDTQRRWQPLYLEIRNESHDYPHRVAKFPENRNRNRYRDVSPTOH

YINASLVDIEEAQRSYILTQGPLPl^CHFWLiyiWQQKTKAVV^^ 

WPDFGVPESPA5FlxNFLPlCVl^SGSLNPDHGPAVIHCSAGIGRSGTFSLVDTCLVL^ 

VLLNMRKYRMGLIQTPDQLRFSYMAIIEGAKCIKGDSSIQKRWKELSKEDLSPAFDHSPNKIMTEKY

NGNRIGLEEEKLTGDRCTGLSSKMQDTMEENSESALRKRIREDRKATTAQKV^ 

RWLYWQPILTKMGFMSVTLVGAFVGWRLFFQQNAL 



Further analysis of the NOV34a protein yielded the following properties shown in 
Table 34B. 



Table 34B. Protein Sequence Properties NOV34a 


PSort analysis: 


0.8500 probability located in endoplasmic reticulum (membrane); 0.4400 
probability located in plasma membrane; 0.3000 probability located in 
nucleus; 0.1000 probability located in mitochondrial inner membrane 
SignalP analysis:



No Known Signal Sequence Predicted 



A search of the NOV34a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 34C.



Table 34C. Geneseq Results for NOV34a 



Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV34a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


AAK14114 


Non-receptor linked protein
tyrosine phosphatase - Homo 
sapiens, 415 aa. 
[WO9113989-A,
19-SEP-1991] 


1..370
1..415


368/415 (88%) 
369/415 (88%) 


0.0 


AAU91293 


Human NOV8 protein - 
Homo sapiens, 415 aa. 
[WO200216600-A2, 
28-FEB-2002] 


1..370 
1..415 


337/415 (81%) 
345/415 (82%) 


0.0 


ABP41882 


Human ovarian antigen 
HOCPJ87, SEQ ID NO:3014 
- Homo sapiens, 368 aa. 
[WO200200677-A1, 
03-JAN-2002] 


24..336 
5..362 


312/358 (87%) 
313/358 (87%) 


e-178 


AAM25250 


Human protein sequence 
SEQ ID NO: 765 - Homo
sapiens, 168 aa. 
[WO200153455-A2, 
26-JUL-2001] 


116..269 
14..167 


137/154 (88%)
145/154 (93%) 


1e-77


AAB56662 


Human prostate cancer 
antigen protein sequence 
SEQ ID NO: 1240 - Homo
sapiens, 180 aa. 
[WO200055174-A1, 
21-SEP-2000] 


1..124 
29..152 


123/124 (99%) 
124/124 (99%) 


1e-69



In a BLAST search of public sequence databases, the NOV34a protein was found to
have homology to the proteins shown in the BLASTP data in Table 34D.

Table 34D. Public BLASTP Results for NOV34a


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV34a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


P17706 


Protein-tyrosine phosphatase, 
non-receptor type 2 (EC 
3.1.3.48) (T-cell
protein-tyrosine phosphatase) 
(TCPTP) - Homo sapiens 
(Human), 415 aa. 


1..370 
1..415 


369/415 (88%) 
370/415 (88%) 


0.0 


A33899 


protein-tyrosine-phosphatase 
(EC 3.1.3.48), nonreceptor type 
2 - human, 415 aa. 


1..370 
1..415 


368/415 (88%) 
369/415 (88%) 


0.0 


A60345 


protein-tyrosine-phosphatase 
(EC 3.1.3.48) 11A - human, 
387 aa. 


1..336 
1..381 


334/381 (87%) 
335/381 (87%) 


0.0 


Q922E7 


Protein tyrosine phosphatase, 
non-receptor type 2 - Mus 
musculus (Mouse), 406 aa. 


1..365 
1..405 


323/410 (78%)
338/410 (81%)


0.0 


Q06180 


Protein-tyrosine phosphatase, 
non-receptor type 2 (EC 
3.1.3.48) (Protein-tyrosine 
phosphatase PTP-2) (MPTP) - 
Mus musculus (Mouse), 382 
aa. 


1..336 
1..376 


298/381 (78%) 
312/381 (81%) 


e-168 



PFam analysis predicts that the NOV34a protein contains the domains shown in
Table 34E.



Table 34E. Domain Analysis of NOV34a 


Pfam Domain 


NOV34a Match Region


Identities/ 
Similarities 
for the Matched 
Region 


Expect Value 


Y_phosphatase 


42..229 


99/272 (36%)
163/272 (60%) 


5.5e-88 



Example 35.

The NOV35 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 35A.



Table 35A. NOV35 Sequence Analysis 




SEQ ID NO: 143 | 908 bp


NOV35a, 
CG149964-01 
DNA Sequence 


CCCTTCTACCCAGAGGGTGAATGGGTATCTTTCCCGGAATAATCCTAATTTTTCTAAGGGTGAAGTT 
TGCAACGGCGGCCGTGACTGTAAGCGGACACCAGAAAAGTACCACTGTAAGTCATGAGATGTCTGGT 
CTGAATTGGAAACCCTTTGTATATGGCGGCCTTGCCTCTATCGTGGCTGAGTTTGGGACTTTCCCTG 
TGGACCTTACCAAAACACGACTTCAGGTTCAAGGCCAAAGCATTGATGCCCGTTTCAAAGAGATAAA
ATATAGAGGGATGTTCCATGCGCTGTTTCGCATCTGTAAAGAGGAAGGTGTATTGGCTCTCTATTCA
GGAATTGCTCCTGCGTTGCTAAGACAAGCATCATATGGCACCATTAAAATTGGGATTTACCAAAGCT 
TGAAGCGCTTATTCGTAGAACGTTTAGAAGATGAAACTCTTTTAATTAATATGATCTGTGGGGTAGT 
GTCAGGAGTGATATCTTCCACTATAGCCAATCCCACCGATGTTCTAAAGATTCGAATGCAGGCTCAA 
GGAAGCTTGTTCCAAGGGAGCATGATTGGAAGCTTTATCGATATATACCAACAAGAAGGCACCAGGG 
GTCTGTGGAGGGGTGTGGTTCCAACTGCTCAGCGTGCTGCCATCGTTGTAGGAGTAGAGCTACCAGT
CTATGATATTACTAAGAAGCATTTAATATTGTCAGGAATGATGGGACATGTGGATCTCTATAAGGGC 
ACTGTTGATGGTATTTTAAAGATGTGGAAACATGAGGGCTTTTTTGCACTCTATAAAGGATTTTGGC 
CAAACTGGCTTCGGCTTGGACCCTGGAACATCATTTTTTTTATTACATACGAGCAGGTAAAGAGGCT 
TCAAATCTAAGAACTGAATTATATGTGAGCCCAGCAC 




ORF Start: ATG at 21 | ORF Stop: TAA at 879





SEQ ID NO: 144 |286 aa |MW at 32043.5kD 


NOV35a, 
CG149964-01 
Protein Sequence 


MGIFPGIILIFLRVKFATAAVTVSGHQKSTTVSHEMSGLNWKPFVYGGLASIVAEFGTFPVDLTKTR
LQVQGQSIDARFKEIKYRGMFHALFRICKEEGVLALYSGIAPALLRQASYGTIKIGIYQSLKRLFVE
RLEDETLLINMICGVVSGVISSTIANPTDVLKIRMQAQGSLFQGSMIGSFIDIYQQEGTRGLWRGVV
PTAQRAAIVVGVELPVYDITKKHLILSGMMGHVDLYKGTVDGILKMWKHEGFFALYKGFWPNWLRLG
PWNIIFFITYEQVKRLQI








SEQ ID NO: 145 | 871 bp


NOV35b, 
309326356 DNA 
Sequence 


CACCGGATCCACCATGGGTATCTTTCCCGGAATAATCCTAATTTTTCTAAGGGTGAAGTTTGCAACG 

GCGGCCGTGATTCACCAGAAAAGTACCACTGTAAGTCATGAGATGTCTGGTCTGAATTGGAAACCCT 

TTGTATATGGCGGCCTTGCCTCTATCGTGGCTGAGTTTGGGACTTTCCCTGTGGACCTTACCAAAAC 

ACGACTTCAGGTTCAAGGCCAAAGCATTGATGCCCGTTTCAAAGAGATAAAATATAGAGGGATGTTC 

CATGCGCTGTTTCGCATCTGTAAAGAGGAAGGTGTATTGGCTCTCTATTCAGGAATTGCTCCTGCGT 

TGCTAAGACAAGCATCATATGGCACCATTAAAATTGGGATTTACCAAAGCTTGAAGCGCTTATTCGT

AGAACGTTTAGAAGATGAAACTCTTTTAATTAATATGATCTGTGGGGTAGTGTCAGGAGTGATATCT 

TCCACTATAGCCAATCCCACCGATGTTCTAAAGATTCGAATGCAGGCTCAAGGAAGCTTGTTCCAAG 

GGAGCATGATTGGAAGCTTTATCGATATATACCAACAAGAAGGCACCAGGGGTCTGTGGAGGGGTGT 

GGTTCCAACTGCTCAGCGTGCTGCCATCGTTGTAGGAGTAGAGCTACCAGTCTATGATATTACTAAG 

AAGCATTTAATATTGTCAGGAATGATGGGACATGTGGATCTCTATAAGGGCACTC 

TAAAGATGTGGAAACATGAGGGCTTTTTTGC^CTCTATAAAGGATTTTGGCCAAACTGGCTTCGGCT 

TGGACCCTGGAACATCATTTTTTTTATTACATACGAGCAGGTAAAGAGGCTTCAAATCGTCGACGGC 




ORF Start: at 2 |ORF Stop: end of sequence 
SEQ ID NO: 146 |290 aa |MW at 32429.9kD 


NOV35b, 
309326356 
Protein Sequence 


TGSTMGIFPGIILIFLRVKFATAAVIHQKSTTVSHEMSGLNWKPFVYGGLASIVAEFGTFPVDLTKT 
RLQVQGQSIDARFKEIKYRGMFHALFRICKEEGVLALYSGIAPALLRQASYGTIKIGIYQSLKRLFV 
ERLEDETLLINMICGVVSGVISSTIANPTDVLKIRMQAQGSLFQGSMIGSFIDIYQQEGTRGLWRGV 
VPTAQRAAIVVGVELPVYDITKKHLILSGMMGHVDLYKGTVDGILKMWKHEGFFALYKGFWPNWLRL 
GPWNIIFFITYEQVKRLQIVDG 





SEQ ID NO: 147 |811 bp | 




NOV35c, 
309326444 DNA 
Sequence 


CACCGGATCCGCCGTGATTCACCAGAAAAGTACCACTGTAAGTCATGAGATGTCTGGTCTGAATTGG 
AAACCCTTTGTATATGGCGGCCTTGCCTCTATCGTGGCTGAGTTTGGGACTTTCCCTGTGGACCTTA 
CCAAAACACGACTTCAGGTTCAAGGCCAAAGCATTGATGCCCGTTTCAAAGAGATAAAATATAGAGG 
GATGTTCCATGCGCTGTTTCGCATCTGTAAAGAGGAAGGTGTATTGGCTCTCTATTCAGGAATTGCT 
CCTGCGTTGCTAAGACAAGCATCATATGGCACCATTAAAATTGGGATTTACCAAAGCTTGAAGCGCT 
TATTCGTAGAACGTTTAGAAGATGAAACTCTTTTAATTAATATGATCTGTGGGGTAGTGTCAGGAGT 
GATATCTTCCACTATAGCCAATCCCACCGATGTTCTAAAGATTCGAATGCAGGCTCAAGGAAGCTTG 
TTCCAAGGGAGCATGATTGGAAGCTTTATCGATATATACCAACAAGAAGGCACCAGGGGTCTGTGGA 
GGGGTGTGGTTCCAACTGCTCAGCGTGCTGCCATCGTTGTAGGAGTAGAGCTACCAGTCTATGATAT 
TACTAAGAAGCATTTAATATTGTCAGGAATGATGGGACATGTGGATCTCTATAAGGGCACTGTTGAT 
GGTATTTTAAAGATGTGGAAACATGAGGGCTTTTTTGCACTCTATAAAGGATTTTGGCCAAACTGGC 
TTCGGCTTGGACCCTGGAACATCATTTTTTTTATTACATACGAGCAGGTAAAGAGGCTTCAAATCGT 
CGACGGC 




ORF Start: at 2 |ORF Stop: end of sequence 





SEQ ID NO: 148 |270 aa |MW at 30239.1kD 


NOV35c, 
309326444 
Protein Sequence 


TGSAVIHQKSTTVSHEMSGLNWKPFVYGGLASIVAEFGTFPVDLTKTRLQVQGQSIDARFKEIKYRG 
MFHALFRICKEEGVLALYSGIAPALLRQASYGTIKIGIYQSLKRLFVERLEDETLLINMICGVVSGV 
ISSTIANPTDVLKIRMQAQGSLFQGSMIGSFIDIYQQEGTRGLWRGVVPTAQRAAIVVGVELPVYDI 
TKKHLILSGMMGHVDLYKGTVDGILKMWKHEGFFALYKGFWPNWLRLGPWNIIFFITYEQVKRLQIV 
DG 





SEQ ID NO: 149 |761 bp | 


NOV35d, 
309326473 DNA 
Sequence 


CACCGGATCCCTGAATTGGAAACCCTTTGTATATGGCGGCCTTGCCTCTATCGTGGCTGAGTTTGGG 
ACTTTCCCTGTGGACCTTACCAAAACACGACTTCAGGTTCAAGGCCAAAGCATTGATGCCCGTTTCA 
AAGAGATAAAATATAGAGGGATGTTCCATGCGCTGTTTCGCATCTGTAAAGAGGAAGGTGTATTGGC 
TCTCTATTCAGGAATTGCTCCTGCGTTGCTAAGACAAGCATCATATGGCACCATTAAAATTGGGATT 
TACCAAAGCTTGAAGCGCTTATTCGTAGAACGTTTAGAAGATGAAACTCTTTTAATTAATATGATCT 
GTGGGGTAGTGTCAGGAGTGATATCTTCCACTATAGCCAATCCCACCGATGTTCTAAAGATTCGAAT 
GCAGGCTCAAGGAAGCTTGTTCCAAGGGAGCATGATTGGAAGCTTTATCGATATATACCAACAAGAA 
GGCACCAGGGGTCTGTGGAGGGGTGTGGTTCCAACTGCTCAGCGTGCTGCCATCGTTGTAGGAGTAG 
AGCTACCAGTCTATGATATTACTAAGAAGCATTTAATATTGTCAGGAATGATGGGACATGTGGATCT 
CTATAAGGGCACTGTTGATGGTATTTTAAAGATGTGGAAACATGAGGGCTTTTTTGCACTCTATAAA 
GGATTTTGGCCAAACTGGCTTCGGCTTGGACCCTGGAACATCATTTTTTTTATTACATACGAGCAGG 
TAAAGAGGCTTCAAATCGTCGACG 




ORF Start: at 2 |ORF Stop: end of sequence 


SEQ ID NO: 150 |254 aa |MW at 28488.2kD 


NOV35d, 
309326473 
Protein Sequence 


TGSLNWKPFVYGGLASIVAEFGTFPVDLTKTRLQVQGQSIDARFKEIKYRGMFHALFRICKEEGVLA 
LYSGIAPALLRQASYGTIKIGIYQSLKRLFVERLEDETLLINMICGVVSGVISSTIANPTDVLKIRM 
QAQGSLFQGSMIGSFIDIYQQEGTRGLWRGVVPTAQRAAIVVGVELPVYDITKKHLILSGMMGHVDL 
YKGTVDGILKMWKHEGFFALYKGFWPNWLRLGPWNIIFFITYEQVKRLQIVDX 





SEQ ID NO: 151 |1019 bp | 


NOV35e, 
CG149964-02 
DNA Sequence 


CTACCCAGAGGGTGAATGGGTATCTTTCCCGGAATAATCCTAATTTTTCTAAGGGTGAAGTTTGCAA 
CG G C GGC CGTG AC TG TAAGCGGAC ACC AGAAAAGTAC CAC TGTAAGTCATG AGATGTCTGGTC TG AA 
TTGGAAACCCTTTGTATATGGCGGCCTTGCCTCTATCGTGGCTGAGTTTGGGACTTTCCCTGTGGAC 
CTTACCAAAACACGACTTCAGGTTCAAGGCCAAAGCATTGATGCCCGTTTCAAAGAGATAAAATATA 
G AGG G ATGTTCC ATG C GC TGTTTCG CATC TGTAAAG AGG AAGGTGT ATTGGCTC TC TATTCAGGAAT 
TGCTCCTGCGTTGCTAAGACAAGCATCATATGGCACCATTAAAATTGGGATTTACCAAAGCTTGAAG 
CGCTTATTCGTAGAACGTTTAGAAGATGAAACTCTTTTAATTAATATGATCTGTGGGGTAGTGTCAG 
GAG TG AT ATC TTCCAC T AT AGCCAATC C C ACCGATGTTC TAAAGATTCGAATGCAGGCTC AAGG AAG 
CTTGTTCCAAGGGAGCATGATT<^AAGCTTTATCGATATATACCAGCAAGAAGGCACCAGGGGTCTG 
TGGAGGGGTGTGGTTC C AACTGC TC AGCGTGCTGCCATCGTTGTAGGAGTAGAGCTACCAGTCTATG 
ATATTACTAAGAAGCATTTAATATTGTGAGGAATGATGGGCGATA 

C^GCTTTACATGTGGTTTGGCT<^GGCTCTGGCCTCCAACCCGGTTGATGTGGTTCGAACTCGCATG 
ATGAACCAGAGGGCAATCGTGGGACATGTGGATCTCTATAAGGGCACTGTTGATGGTATTTTAAAGA 
TGTGGAAACATGAGGGCTTTTTTGCAC TCTATAAAGGATTTTGGCCAAAC TGGC TTCGGCTTGGACC 
CTGG AAC ATC ATTTTT TTT ATTAC ATACG AGCAGGTAAAGAGGC TTC AAATC TAAG AAC TGAATTAT 
ATGTG AGCC C AGCC 




ORF Start: ATG at 16 |ORF Stop: TAA at 991 





SEQ ID NO: 152 |325 aa |MW at 36175.2kD 


NOV35e, 
CG149964-02 
Protein Sequence 


MGIFPGIILIFLRVKFATAAVTVSGHQKSTTVSHEMSGLNWKPFVYGGLASIVAEFGTFPVDLTKTR 
LQVQGQSIDARFKEIKYRGMFHALFRICKEEGVLALYSGIAPALLRQASYGTIKIGIYQSLKRLFVE 
RLEDETLLINMICGVVSGVISSTIANPTDVLKIRMQAQGSLFQGSMIGSFIDIYQQEGTRGLWRGVV 
PTAQRAAIVVGVELPVYDITKKHLILSGMMGDTILTHFVSSFTCGLAGALASNPVDVVRTRMMNQRA 
IVGHVDLYKGTVDGILKMWKHEGFFALYKGFWPNWLRLGPWNIIFFITYEQVKRLQI 
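The ORF coordinates and protein lengths listed in Table 35A can be cross-checked against each other. The following is a minimal sketch, assuming that the listed start position is the first base of the ATG and the listed stop position is the first base of the stop codon (both 1-based); the function name `orf_codon_count` is illustrative and does not come from the specification.

```python
def orf_codon_count(start: int, stop: int) -> int:
    """Return the number of coding codons in an ORF.

    start: 1-based position of the first base of the start codon.
    stop:  1-based position of the first base of the stop codon.
    Both conventions are assumptions about how Table 35A reports positions.
    """
    span = stop - start
    if span <= 0 or span % 3 != 0:
        raise ValueError("start and stop codons are not in the same reading frame")
    return span // 3


# (clone, ORF start, ORF stop, listed protein length in aa), copied from Table 35A
LISTED = [
    ("NOV35a", 21, 879, 286),
    ("NOV35e", 16, 991, 325),
]

for clone, start, stop, listed_aa in LISTED:
    # Under the stated assumptions, the codon count equals the listed length.
    assert orf_codon_count(start, stop) == listed_aa, clone
```

Entries whose ORF stop is given as "end of sequence" (NOV35b through NOV35d) have no stop position to check and are left out of the sketch.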



Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 35B. 



Table 35B. Comparison of NOV35a against NOV35b through NOV35e. 

Protein Sequence | NOV35a Residues/Match Residues | Identities/Similarities for the Matched Region 
NOV35b | 1..286 / 5..287 | 282/286 (98%) / 282/286 (98%) 
NOV35c | 26..286 / 7..267 | 261/261 (100%) / 261/261 (100%) 
NOV35d | 39..286 / 4..251 | 248/248 (100%) / 248/248 (100%) 
NOV35e | 1..286 / 1..325 | 286/325 (88%) / 286/325 (88%) 
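The identity figures in Table 35B pair a match count with an aligned length and an integer percentage. The percentages are consistent with integer truncation rather than rounding (282/286 is shown as 98%, not 99%); that reading is an inference from the table values, not something the text states. A minimal sketch under that assumption, with the hypothetical helper name `percent_truncated`:

```python
def percent_truncated(matches: int, aligned_length: int) -> int:
    """Integer percentage of matches over the aligned length, truncated.

    Truncation (floor) is an assumption inferred from Table 35B.
    """
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    if not 0 <= matches <= aligned_length:
        raise ValueError("matches must lie between 0 and aligned_length")
    return (100 * matches) // aligned_length


# (variant, matches, aligned length, percentage printed in Table 35B)
TABLE_35B = [
    ("NOV35b", 282, 286, 98),
    ("NOV35c", 261, 261, 100),
    ("NOV35d", 248, 248, 100),
    ("NOV35e", 286, 325, 88),
]

for variant, matches, length, printed in TABLE_35B:
    assert percent_truncated(matches, length) == printed, variant
```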



Further analysis of the NOV35a protein yielded the following properties shown in 
Table 35C. 
Table 35C. Protein Sequence Properties NOV35a 


PSort analysis: 


0.4600 probability located in plasma membrane; 0.2648 probability located in 
microbody (peroxisome); 0.1000 probability located in endoplasmic reticulum 
(membrane); 0.1000 probability located in endoplasmic reticulum (lumen) 


SignalP analysis: 


Cleavage site between residues 20 and 21 
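A predicted cleavage site "between residues 20 and 21" divides the polypeptide into a putative signal peptide (residues 1..20) and a mature chain (residue 21 onward). The sketch below shows that split on a made-up toy sequence; the function name `split_at_cleavage` and the toy sequence are illustrative assumptions, not part of the SignalP output.

```python
def split_at_cleavage(protein: str, last_signal_residue: int) -> tuple:
    """Split a protein at a cleavage site given as a 1-based residue number.

    last_signal_residue is the final residue of the signal peptide, so a
    site "between residues 20 and 21" is passed as 20.
    """
    if not 0 < last_signal_residue < len(protein):
        raise ValueError("cleavage site must fall inside the sequence")
    signal_peptide = protein[:last_signal_residue]
    mature_chain = protein[last_signal_residue:]
    return signal_peptide, mature_chain


# Hypothetical 24-residue toy sequence, used only to show the indexing.
toy = "M" + "A" * 19 + "QKST"
signal, mature = split_at_cleavage(toy, 20)
assert len(signal) == 20
assert mature == "QKST"
```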



A search of the NOV35a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 35D. 



Table 35D. Geneseq Results for NOV35a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV35a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


AAY94665 


Human uncoupling protein 
(UCP5) amino acid sequence 
- Homo sapiens, 325 aa. 
[WO200032624-A2, 
08-JUN-2000] 


1..286 
1..325 


284/325 (87%) 
285/325 (87%) 


e-158 


ABG33878 


Human secreted protein 
encoded by gene 16 - Homo 
sapiens, 334 aa. 
[WO200226931-A2, 
04-APR-2002] 


1..286 
1..334 


284/334 (85%) 
285/334 (85%) 


e-155 


AAE06056 


Human gene 16 encoded 
secreted protein HMIAP86, 
SEQ ID NO:118 - Homo 
sapiens, 334 aa. 
[WO200151504-A1, 
19-JUL-2001] 


1..286 
1..334 


284/334 (85%) 
285/334 (85%) 


e-155 
AAY87079 


Human secreted protein 
sequence SEQ ID NO:118 - 
Homo sapiens, 335 aa. 
[WO200004140-A1, 
27-JAN-2000] 


1..286 
1..334 


284/334 (85%) 
285/334 (85%) 


e-155 


AAY94666 


Human uncoupling protein 
isoform hUCP5S amino acid 
sequence - Homo sapiens, 
322 aa. [WO200032624-A2, 
08-JUN-2000] 


1..286 
1..322 


281/325(86%) 
282/325 (86%) 


e-154 
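The Expect values in Table 35D are printed in an abbreviated form in which a bare exponent such as "e-158" stands for 1e-158. A small parser makes such values comparable numerically; this is a sketch, and the helper name `parse_expect` is an assumption introduced here for illustration.

```python
def parse_expect(text: str) -> float:
    """Convert a printed Expect value to a float.

    Handles the abbreviated form "e-158" (read as 1e-158) as well as
    ordinary forms such as "5e-72" and "0.0".
    """
    cleaned = text.strip().lower()
    if cleaned.startswith("e"):
        cleaned = "1" + cleaned  # restore the implicit leading 1
    return float(cleaned)


# Expect values as printed for the five Geneseq hits in Table 35D
printed = ["e-158", "e-155", "e-155", "e-155", "e-154"]
values = [parse_expect(p) for p in printed]

# The hits are listed from most to least significant (smallest Expect first).
assert values == sorted(values)
```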


In a BLAST search of public sequence databases, the NOV35a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 35E. 


Table 35E. Public BLASTP Results for NOV35a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV35a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 

Portion 


Expect 
Value 


O95258 


Brain mitochondrial carrier 
protein-1 (BMCP-1) 
(Mitochondrial uncoupling 
protein 5) (UCP 5) (Solute 
carrier family 25, member 14) 
- Homo sapiens (Human), 325 
aa. 


1..286 
1..325 


284/325 (87%) 
285/325 (87%) 


e-157 


Q9Z2B2 


Brain mitochondrial carrier 
protein-1 (BMCP-1) 
(Mitochondrial uncoupling 
protein 5) (UCP 5) (Solute 
carrier family 25, member 14) 
- Mus musculus (Mouse), 325 
aa. 


1..286 
1..325 


276/325 (84%) 
283/325 (86%) 


e-154 


Q9EP88 


Brain mitochondrial carrier 
protein BMCP1 (Brain 1 
mitochondrial carrier 
protein-1) - Rattus norvegicus 
(Rat), 325 aa. 


1..286 
1..325 


274/325 (84%) 
282/325 (86%) 


e-153 


Q9JMH0 


Brain mitochondrial carrier 
protein-1 - Rattus norvegicus 
(Rat), 322 aa. 


1..286 
1..322 


271/325 (83%) 
279/325 (85%) 


e-149 


Q8R206 


Similar to RIKEN cDNA 
4933433D23 gene - Mus 
musculus (Mouse), 210 aa. 


36..232 
1..197 


160/197(81%) 
176/197 (89%) 


1e-87 
PFam analysis predicts that the NOV35a protein contains the domains shown in 
Table 35F. 
Table 35F. Domain Analysis of NOV35a 

Pfam Domain | NOV35a Match Region | Identities/Similarities for the Matched Region | Expect Value 
mito_carr | 39..138 | 39/126 (31%) / 78/126 (62%) | 5.7e-24 
mito_carr | 140..231 | 29/125 (23%) / 76/125 (61%) | 4.4e-27 
mito_carr | 233..286 | 24/125 (19%) / 46/125 (37%) | 0.0072 
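The three mito_carr match regions in Table 35F can be checked for overlap and summed to see how much of the 286-residue NOV35a protein they cover. The following sketch assumes the regions are inclusive, 1-based residue ranges written as "start..end"; `parse_region` is a hypothetical helper.

```python
def parse_region(text: str) -> tuple:
    """Parse a 'start..end' residue range into a (start, end) pair of ints."""
    start, end = text.replace(" ", "").split("..")
    return int(start), int(end)


PROTEIN_LENGTH = 286  # NOV35a length from Table 35A
regions = [parse_region(r) for r in ("39..138", "140..231", "233..286")]

# Inclusive ranges: length is end - start + 1.
lengths = [end - start + 1 for start, end in regions]

# Consecutive regions must not overlap.
non_overlapping = all(
    prev_end < next_start
    for (_, prev_end), (next_start, _) in zip(regions, regions[1:])
)

covered = sum(lengths)
assert non_overlapping
assert lengths == [100, 92, 54]
assert covered == 246 and covered <= PROTEIN_LENGTH
```

Under these assumptions the three repeats span 246 of the 286 residues.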



Example 36. 

The NOV36 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 36A. 



Table 36A. NOV36 Sequence Analysis 




SEQ ID NO: 153 |1144 bp | 


NOV36a, 
CG150306-01 
DNA Sequence 


CGCGGGGCGCGCGGCGCGGGGCGGCCTGGCCGGCGGCGGCGGCGGCATGAAGGTCACGTCGCTCGAC 


GGGCGCCAGCTGCGCAAGATGCTCCGCAAGGAGGCGGCGGCGCGCTGCGTGGTGCTCGACTGCCGGC 
CCTATCTGGCCTTCGCTGCCTCGAACGTGCGCGGCTCGCTCAACGTCAACCTCAACTCGGTGGTGCT 
GGACCAGGGCAGCCGCCACTGGCAGAAGCTGCGAGAGGAGAGCGCCGCGCGTGTCGTCCTCACCTCG 
CTACTCGCTTGCCTACCCGCCGGCCCGCGGGTCTACTTCCTCAAAGGGGGATATGAGACTTTCTACT 
CGGAATATCCTGAGTGTTGCGTGGATGTAAAACCCATTTCACAAGAGAAGATTGAGAGTGAGAGAGC 
CCTCATCAGCCAGTGTGGAAAACCAGTGGTAAATGTCAGCTACAGGCCAGCTTATGACCAGGGTGGC 
CCAGTTGAAATCCTTCCCTTCCTCTACCTTGGAAGTGCCTACCATGCATCCAAGTGCGAGTTCCTCG 
CCAACTTGCACATCACAGCCCTGCTGAATGTCTCCCGACGGACCTCCGAGGCCTGCATGACCCACCT 
ACACTACAAATGOATCCCTGTGGAAGACAGCCACACGGCTGACATTAGCTCCC^CTTTCAAGAAGCA 
ATAGACTTCATTGACTGTGTCAGGGAAAAGGGAGGCAAGGTCCTGGTCCACTGTGAGGCTGGGATCT 
CCCGTTCACCCACCATCTGCATGGCTTACCTTATGAAGACCAAGCAGTTCCGCCTGAAGGAGGCCTT 
CGATTACATCAAGCAGAGGAGGAGCATGGTCTCGCCCAACTTTGGCTTCATGGGCCAGCTCCTGCAG 
TACGAATCTGAGATCCTGCCCTCCACGCCCAAC CCCCAGCCTCCCTCC TGCC AAGGGGAGGCAGCAG 
GCTC TTC ACTGATAGGCCATTTGCAGAC ACTGAGCC C TGAC ATGCAGGGTGCCTACTGCAC ATTCCC 
TGC C TCGGTGCTGGCACCGGTGCCTACCC ACTCAAC AGTCTCAGAGCTC AGCAGAAGCCCTGTGGCA 
ACC^nCACATCCTGCTAAAACTGGGATGGAGGAATCGGCCCAGCCCCAAGAGCAACTCTGATTTTTC 


T T'X^'X* 'X' 




ORF Start: ATG at 47 |ORF Stop: TAA at 1088 








SEQ ID NO: 154 |347 aa |MW at 38362.6kD 


NOV36a, 
CG150306-01 
Protein Sequence 


MKVTSLDGRQLRKMLRKEAAARCVVLDCRP 


ARWLTSLIACLPAGPRVYFLKGGYETC^ 

PAYPQGGPVEILPFLYLGSAYHASKCEFLANLHITALLNVSRRTSE1ACMTHLHYKWI PVEDSHTADI 
S SH FQEAI DF I DC VREKGGK VL VHCEAG I SRSPTI CMAYLMKTKQFRLKEAFD YIKQRR SMVS PNPG 
FMGQLLQYESEILPSTPNPQPPSCQGEAAGSSLIGHLQTLSPDMQGAYCTFPASVLAPVPTHSTVSE 
LSRSPVATATSC 



Further analysis of the NOV36a protein yielded the following properties shown in 
Table 36B. 



Table 36B. Protein Sequence Properties NOV36a 


PSort analysis: 


0.4811 probability located in mitochondrial matrix space; 0.4500 probability 
located in cytoplasm; 0.1892 probability located in mitochondrial inner 
membrane; 0.1892 probability located in mitochondrial intermembrane space 


SignalP analysis: 


No Known Signal Sequence Predicted 



A search of the NOV36a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 36C. 



Table 36C. Geneseq Results for NOV36a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV36a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Region 


Expect 
Value 


ABB07842 


Amino acid sequence of 
protein identified by 
Swissprot Accn No. Q16690 
- Homo sapiens, 384 aa. 
[WO200220732-A2, 
14-MAR-2002] 


1..347 
1..384 


347/384 (90%) 
347/384 (90%) 


0.0 


AAB66440 


Human MAP-kinase 
phosphatase MKP-5 - Homo 
sapiens, 171 aa. 
[WO200102582-A1, 
11-JAN-2001] 


116..286 
1..171 


171/171 (100%) 
171/171 (100%) 


1e-97 


AAE06784 


Human dual-specificity 
phosphatase (DSP) protein, 
MKP-5 - Homo sapiens, 171 
aa. [WO200157221-A2, 
09-AUG-2001] 


116..286 
1..171 


171/171 (100%) 
171/171 (100%) 


1e-97 
AAR63602 


MAP-kinase-phosphatase 
CL100 - Homo sapiens, 367 
aa. [WO9423039-A, 
13-OCT-1994] 


1..347 
3..367 


168/388 (43%) 
220/388 (56%) 


5e-72 


AAU84270 


Human endometrial cancer 
related protein, DUSP1 - 
Homo sapiens, 367 aa. 
[WO200209573-A2, 
07-FEB-2002] 


1..347 
3..367 


167/388 (43%) 
219/388 (56%) 


1e-70 


In a BLAST search of public sequence databases, the NOV36a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 36D. 


Table 36D. Public BLASTP Results for NOV36a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV36a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


Q16690 


Dual specificity protein 
phosphatase 5 (EC 3.1.3.48) 
(EC 3.1.3.16) (Dual 
specificity protein 
phosphatase hVH3) - Homo 
sapiens (Human), 384 aa. 


1..347 
1..384 


347/384 (90%) 
347/384 (90%) 


0.0 


O54838 


Dual specificity protein 
phosphatase 5 (EC 3.1.3.48) 
(EC 3.1.3.16) (MAP-kinase 
phosphatase CPG21) - Rattus 
norvegicus (Rat), 384 aa. 


1..347 
1..384 


320/384 (83%) 
336/384 (87%) 


0.0 


Q90W58 


MAP kinase phosphatase 
XCL100(beta) protein - 
Xenopus laevis (African 
clawed frog), 369 aa. 


13..347 
15..369 


164/378 (43%) 
217/378 (57%) 


9e-72 


P28562 


Dual specificity protein 
phosphatase 1 (EC 3.1.3.48) 
(EC 3.1.3.16) (MAP kinase 
phosphatase-1) (MKP-1) 
(Protein-tyrosine phosphatase 
CL100) (Dual specificity 
protein phosphatase hVH1) - 
Homo sapiens (Human), 367 
aa. 


1..347 
3..367 


167/388 (43%) 
219/388 (56%) 


3e-70 
O42253 


Dual specificity protein 
phosphatase 1 (EC 3.1.3.48) 
(EC 3.1.3.16) (MAP kinase 
phosphatase-1) (MPK-1) 
(MAP kinase phosphatase-1) - 
Gallus gallus (Chicken), 353 
aa (fragment). 


15..344 
4..353 


166/366 (45%) 
213/366 (57%) 


1e-68 









PFam analysis predicts that the NOV36a protein contains the domains shown in the 
Table 36E. 
Table 36E. Domain Analysis of NOV36a 

Pfam Domain | NOV36a Match Region | Identities/Similarities for the Matched Region | Expect Value 
Rhodanese | 7..98 | 23/134 (17%) / 66/134 (49%) | 0.0052 
DSPc | 141..279 | 76/172 (44%) / 132/172 (77%) | 1.8e-70 
Y_phosphatase | 44..279 | 39/336 (12%) / 144/336 (43%) | 0.54 



Example 37. 

The NOV37 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 37A. 



Table 37A. NOV37 Sequence Analysis 




SEQ ID NO: 155 |2277 bp | 


NOV37a, 
CG150510-01 
DNA Sequence 


CGCGTTGTGGGCTCCCGCCGGGGTCCCCCGCGGCTGTCGCCGCCGCCTACGCCGCTGCCTCCGCCTT 


CCTGCCCCGCGTCGGGCCGGGCGCCACCTCCCCCCTGCCTCCCTCTCCGCTGTGGTCATTTAGGAAA 


TCGTAAATCATGTGAAGATGGGACTCTTGGTATTTGTGCGCAATCTC 

TCTGGTACTGGGATTTTTGTATTATTCTGCGTGGAAGCTACACTTACTCCAGTGGGAGGAGGACTCC 
AGTAAGTATAGTCACTCTAGCTCACCCCAGGAGAAGCCTGTTGCAGATTCAGTGGTTCTTTCCTTTG 
ACTCCGCTGGACAAACACTAGGCTCAGAGTATGATCGGTTGGGCTTCCTCCTGAATCTGGACTCTAA 
AC TGCC TGC TG AATTAGC CACCAAGTACGCAAAC TTTTCAGAGGGAGC TTGCAAGCCTGGCTATGC T 
TCAGCCTTGATGACGGCCATOTTCCCCCGGTTCTCCAAGCCAGCACCCATGTTCCTGGATGACTCCT 
TTCGCAAGTGGGCTAGAATCCGGGAGTTCGTGCCGCCTTTTGGGATCAAAGGTCAAGACAATCTGAT 
CAAAGCCATCTTGTCAGTCACCAAAGAGTACCGCCTGACCCCTGCCTTGGACAGCCTCCGCTGCCGC 
CGCTGCATCATCGTGGGCAATGGAGGCGTTCTTGCCAACAAGTCTCTGGGGTCACGAATTGACGACT 
ATG AC ATTGTGGTGAGAC TG AATTC AGCAC C AGTG AAAGGCT TTGAG AAGG ACGTGGGC AGCAAAAC 
GACACTGCGCATCACCTACCCCGAGGGCGCCATGCAGCGGCCTGAGCAGTACGAGCGCGATTCTCTC 
TTTGTCCTCGCCGGCTTCAAGTGGCAGGACTTTAAGTGGTTGAAATACATCGTCTACAAGGAGAGAG 
TGAGTGC ATCGGATGGCTTC TGGAAATCTGTGGCC AC TCGAGTGCCC AAGGAGCCCCCTGAGATTCG 
AAT C C TCAAC CCAT ATTTC ATCCAGG AGGC C GC CTTCAC CC TCATTG GC CTGC CC T TCAAC AATGGC 
CTCAT(K3GCCGGGGGAACATCCCTACCCTTGG£A^ 

ACGAGGTGGCAGTCGCAGGATTTGGCT ATGACATGAGCAC AC CCAACGC ACCCCTGC AC TAC TATGA 
GACCGTTCGCATGGCAGCCATCAAAGAGTCCTGGACGCACAATATCCAGCGAGAGAAAGAGTTTCTG 

GCCATAGAGGCCCAGGCACCACCAGGAGCAGCAGCCAGCACCACCTACACAGGAGTCTTCAGACCCA 


GAGAAGGACGGTGCCAAGGGCCCCAGGGGCAGCAAGGCCTTGGTGGAGCAGCCAGAGCTGTGCCTGC 


TCAGCAGCCAGTCTCAGAGACCAGCACTCAGCCTCATTCAGCATGGGTCCTTGATGCCAGAGGGCCA 


GCAGGCTCCTGGCTGTGCCCAGCAGGCCCAGCATGCAGGTGGTGGGACACTGGGCAGCAAGGCTGCT 


GCCGGAATCACTTCTCCAATCAGTGTTTGGTGTATTATCATTTTGTGAA 


TAGGGATAATTTATTTTTAAATAAGGTTGGAGATGTCAAGTTGGGTTCACTTGCCATGCAGGAAGAG 


GCCC AC TAGAGGGCC C ATC AGGC AGTGTTACCTGTTAGCTC CCTGTGGGGC AGGAGTGCCAGGACC A 


GCCTGTACCTTGCTGTGGGGCTACAGGATGGTGGGCAGGATCTCAAGCCAGCCCCCTCCAGCTCATG 


ACACTGTTTGGCCTTTCTTGGGGAGAAGGCGGGGTATTCCCACTCACCAGCCCTAGCTGTCCCATGG 


GGAAACCCTGGAGCCATCCCTTCGGAGCCAACAAGACCGCCCCAGGGCTATAGCAGAAAGAACTTTA 


AAGCTCAGGAGGGTGACGCCCAGCTCCGCCTGCTGGGAAGAGCTCCCCTCCACAGCTGCAGCTGATC 


CATAGGACTACCG(^GGCCCGGACTCACCAACTTGCCACATGTTCTAGGTTTCAGCAACAAqACTgC 


CAGGTGGTTGGGTTCTGCCTTTAGCCTGGACCAAAGGGAAGTGAGGCCCAAGGAGCTTACCCAAGCT 


GTGGCAGCCGTCCCAGGCCACCCCCATGC3AAGCAATAAAGCTCTTCCCTGTAAAAAAAAAAAAAAA 




ORF Start: ATG at 152 |ORF Stop: TGA at 1322 





SEQ ID NO: 156 |390 aa |MW at 43785.1kD 


NOV37a, 
CG150510-01 
Protein Sequence 


MGLLVWRNLLLALCLFI,VLGFLYYSA^^ 

LG S E YDRLGFLLNLDSKL PAEL ATKYANF SEGACKPGYAS ALMTAI FPRFSKPAPMFLDDSFRKWAR 
IREFVP PFG IKGQDNL I KAI LSVTKEYRLTP ALDSLRCRRC I IVGNGG VLANKSLG SRI DDYD IWR 
LNSAPVKGFEKDVGSKTTLRITYPEGAMQRPEQYERDSLFVLAGFKWQDFKWLKYIVYKERVSASDG 
FWKS VATRVPKEP PE IRILNP YF I QEAAFTL IGLPFNNGLMGRGNI PTLG SV AVTMALHGCDEVAVA 
GFG YDMST PNAPLHYYETVRMAAI KESWTHN IQREKEFLRKLVKARVITDL S SG I 



Further analysis of the NOV37a protein yielded the following properties shown in 
Table 37B. 
Table 37B. Protein Sequence Properties NOV37a 


PSort analysis: 


0.8200 probability located in outside; 0.2360 probability located in microbody 
(peroxisome); 0.1900 probability located in lysosome (lumen); 0.1000 
probability located in endoplasmic reticulum (membrane) 


SignalP analysis: 


Cleavage site between residues 22 and 23 



A search of the NOV37a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 37C. 



Table 37C. Geneseq Results for NOV37a 
Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV37a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


AAY39960 


Human alpha2-3 sialate 
transferase protein sequence - 
Homo sapiens, 375 aa. 
[JP11253163-A, 
21-SEP-1999] 


1..390 
1..375 


374/390 (95%) 
375/390 (95%) 


0.0 


AAR65242 


Human ST3N 
sialyltransferase - Homo 
sapiens, 375 aa. 
[WO9504816-A, 
16-FEB-1995] 


1..390 
1..375 


374/390 (95%) 
375/390 (95%) 


0.0 


AAR63217 


Human 

alpha-2,3-sialyltransferase 
(WM16) - Homo sapiens 
(melanoma WM266-4 cells), 
375 aa. [WO9423021-A, 
13-OCT-1994] 


1..390 
1..375 


374/390(95%) 
375/390(95%) 


0.0 


AAR62808 


Alpha 2, 3-sialyl transferase - 
Homo sapiens, 375 aa. 
[JP06277052-A, 
04-OCT-1994] 


1..390 
1..375 


374/390 (95%) 
375/390 (95%) 


0.0 


AAR41671 


Rat sialyltransferase - Rattus 
rattus, 374 aa. 
[WO9318157-A, 
16-SEP-1993] 


1..390 
1..374 


361/390(92%) 
370/390(94%) 


0.0 



In a BLAST search of public sequence databases, the NOV37a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 37D. 
Table 37D. Public BLASTP Results for NOV37a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV37a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


Q11203 


CMP-N-acetylneuraminate-beta-1,4-galactoside 
alpha-2,3-sialyltransferase (EC 
2.4.99.6) (N-acetyllactosaminide 
alpha-2,3-sialyltransferase) (Gal 
beta-1,3(4) GlcNAc alpha-2,3 
sialyltransferase) (ST3N) 
(Sialyltransferase 6) - Homo sapiens 
(Human), 375 aa. 


1..390 
1..375 


374/390 (95%) 
375/390(95%) 


0.0 
Q922X5 


Sialyltransferase (N-acetyllacosaminide 
alpha 2,3-sialyltransferase) - Mus 
musculus (Mouse), 374 aa. 


1..390 
1..374 


371/390 (94%) 


Q9DBB6 


Sialyltransferase (N-acetyllacosaminide 
alpha 2,3- sialyltransferase) - Mus 
musculus (Mouse), 374 aa. 


1..390 
1..374 


360/390(92%) 
371/390(94%) 


0.0 


Q02734 


CMP-N-acetylneuraminate-beta-1,4-galactoside 
alpha-2,3-sialyltransferase (EC 
2.4.99.6) (N-acetyllactosaminide 
alpha-2,3-sialyltransferase) (Gal 
beta-1,3(4) GlcNAc alpha-2,3 
sialyltransferase) (ST3N) 
(Sialyltransferase 6) - Rattus norvegicus 
(Rat), 374 aa. 


1..390 
1..374 


361/390(92%) 
370/390 (94%) 


0.0 


P97325 


CMP-N-acetylneuraminate-beta-1,4-galactoside 
alpha-2,3-sialyltransferase (EC 
2.4.99.6) (N-acetyllactosaminide 
alpha-2,3-sialyltransferase) (Gal 
beta-1,3(4) GlcNAc alpha-2,3 
sialyltransferase) (ST3N) 
(Sialyltransferase 6) - Mus musculus 
(Mouse), 374 aa. 


1..390 
1..374 


359/390(92%) 
370/390 (94%) 


0.0 



PFam analysis predicts that the NOV37a protein contains the domains shown in the 
Table 37E. 
Table 37E. Domain Analysis of NOV37a 

Pfam Domain | NOV37a Match Region | Identities/Similarities for the Matched Region | Expect Value 
Glyco_transf_29 | 101..389 | 108/324 (33%) / 270/324 (83%) | 3.2e-116 



Example 38. 

The NOV38 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 38A. 



Table 38A. NOV38 Sequence Analysis 




SEQ ID NO: 157 |1076 bp | 


NOV38a, 
CG150704-01 
DNA Sequence 


CCCTTATGAAGACGGGACATTTTGAAATAGTCACCATGCTGCTGGCAACCATGATTCTAGTGGACAT 


TTTCCAGGTGAAGGCTO 

ACGGACAGGATGGAAATTAAATACGTTCCCCAACTGCTAAAGGAGGAAAAAGCAAGCCACCAGCAAT 
TAGATACTGTGTGGGAAAATGCAAAAGCCAAATGGGCAGCCCGAAAGACTCAAATCTTTCTCCCTAT 
GAATTTTAAGGATAACCATGGAATAGCCCTGATGGCATATATTTCCGAAGCTCAAGAGCAAACTCCC 
TTTTAC C ATC TGTTC AGTG AAG C TGTG AAG ATGG C TGGCCAATCTCGAG AAGATTAT ATC TATGGC T 
TCCAGTTC AAAGCTTTCCACTT TTACCTC AC AAGAGCCCTGC AGTTGCTGAG AAAACC TTGTGAGGC 
CAGTTCCAAAACTGTGGTATATAGAACAAGCCAGGGCACTTCATTTACATTTGGAGGGCTAAACCAA 
GCCAGGTTTGGCCATTTTACCTTGGCATATTCAGCCAAACCTCAGGCTGCTAATGACCAGCTCACTG 
TGTTATCCATCTACACATGCCTTGGAGTTGACATTGAAAATTTTCTTGATAAAGAAAGTGAAAGAAT 
TACTTTAATACCTCTGAATGAGGTTTTTCAAGTGTCACAGGAGGGGGCTGGCAATAACCTTATCCTT 
CAAAGCATAAACAAGACCTGCAGCCATTATGAGTGTGCATTTCTAGGTGGACTAAAAACCGAAAACT 
GTATTGAGAACCTAGAATATTTTCAACCCATCTATGTCTACAACCCTGGTGAGAAAAACCAGAAGCT 
TGAAGACCATAGTG AGAAAAAC TGGAAGCTTG AAGACCATGGTGAGAAAAACC AGAAG CTTGAAGAC 
CATGCTCCAGGTCCAGTTCCTGTTCCAGGTCCCAAAAGCCATCCTTCTGCATCCTCGGGCAAACTGC 



GTAG 



ORF Start: ATG at 6 |ORF Stop: TAG at 1074 





SEQ ID NO: 158 |356 aa |MW at 40311.7kD 


NOV38a, 
CG150704-01 
Protein Sequence 


MKTGHFE I VTMLLATMI LVDI F QVKAEVLDMADNAFDDE YLKCTDRME IKYVPQLLKEEKASHQQLD 
TVWENAKAKWAARKTQIFLPMNFKDNHGIALMAYISEAQEQTPFYHLFSEAVKl^AGQSREDYiyGFQ 
FKAFHF YLTRALQLLRK PCEA S SKTVVYRTSQGT SFTFGGLNQARFGHFTLAYS AKPQAANDQLTVL 
SIYTCLGVDIENFLDKESERITLIPLNEVFQVSQEGAGNNLILQSINKTCSHYECAFLGGLKTENCI 
EmEYFQPIYVYNPGEKNQKLEDHSEKNV^EDHG 

PQFGMVIILISVSAINLFVAL _ 
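The relationship in Table 38A between the ORF start position and the encoded protein can be illustrated by translating the opening codons of the NOV38a DNA sequence with the standard genetic code. This is a minimal sketch: the 35-nucleotide prefix below was transcribed by hand from the first line of the DNA sequence and should be treated as an assumption, and the names `CODON_TABLE` and `translate` are introduced here only for illustration.

```python
# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str, orf_start: int) -> str:
    """Translate dna from a 1-based ORF start until a stop codon or the end."""
    residues = []
    for pos in range(orf_start - 1, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3].upper()]
        if residue == "*":  # stop codon
            break
        residues.append(residue)
    return "".join(residues)


# First 35 nucleotides of the NOV38a DNA sequence (hand-transcribed assumption).
prefix = "CCCTTATGAAGACGGGACATTTTGAAATAGTCACC"

# ORF Start: ATG at 6, as listed in Table 38A.
assert translate(prefix, 6) == "MKTGHFEIVT"
```

The ten residues obtained agree with the start of the NOV38a protein sequence listed in Table 38A.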



Further analysis of the NOV38a protein yielded the following properties shown in 
Table 38B. 
Table 38B. Protein Sequence Properties NOV38a 


PSort analysis: 


0.6850 probability located in endoplasmic reticulum (membrane); 0.6400 
probability located in plasma membrane; 0.4600 probability located in Golgi 
body; 0.1000 probability located in endoplasmic reticulum (lumen) 


SignalP analysis: 


Cleavage site between residues 27 and 28 



A search of the NOV38a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 38C. 



Table 38C. Geneseq Results for NOV38a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV38a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 



AAR41876 


Human HT6 - Homo sapiens, 

Zdv aa. [UrAZWZlO-A, 
23-SEP-1993] 


29..256 
I..LII 


82/238 (34%) 


1e-21 


AAW76806 


Human 

ADP-ribosyltransferase 
protein - Homo sapiens, 327 

aa. [US5834310-A, 
10-NOV-1998] 


20..266 
31..287 


83/266 (31%) 
123/266 (46%) 


6e-21 


AAW76804 


Rabbit skeletal muscle 
ADP-ribosyltransferase 
protein - Oryctolagus 
cuniculus, 327 aa. 
[US5834310-A, 
10-NOV-1998] 


8..259 
6..280 


88/282 (31%) 
130/282(45%) 


1e-20 


AAR37572 


Rabbit skeletal muscle 
ADP-ribosyltransferase - 
Oryctolagus cuniculus, 327 
aa. [USN7985698-N, 
01-MAY-1993] 


8..259 
6..280 


88/282 (31%) 
130/282 (45%) 


1e-20 


ABB97573 


Novel human protein SEQ ID 
NO: 841 - Homo sapiens, 229 
aa. [WO200222660-A2, 
21-MAR-2002] 


29..163 
29..161 


59/137 (43%) 
76/137 (55%) 


1e-20 


In a BLAST search of public sequence databases, the NOV38a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 38D. 


Table 38D. Public BLASTP Results for NOV38a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV38a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


S62906 


mono-ADP-ribosyltransferase - 
human, 367 aa. 


1..356 
1..367 


356/367 (97%) 
356/367 (97%) 


0.0 


Q8WVJ7 


Hypothetical 42.7 kDa protein - 
Homo sapiens (Human), 378 
aa. 


1..356 
1..378 


355/378 (93%) 
355/378 (93%) 


0.0 


Q13508 


Ecto-ADP-ribosyltransferase 3 
precursor (EC 2.4.2.31) 
(NAD(P)(+)--arginine 
ADP-ribosyltransferase 3) 
(Mono(ADP-ribosyl)transferase 
3) - Homo sapiens (Human), 
389 aa. 


1..356 
1..389 


355/389 (91%) 
355/389 (91%) 


0.0 
Unknown (protein for 
MGC:14489) - Homo sapiens 
(Human), 389 aa. 


1..356 
1..389 


354/389 (91%) 
354/389 (91%) 


0.0 


Q9GKV6 


Hypothetical 38.2 kDa protein - 
Macaca fascicularis (Crab 
eating macaque) (Cynomolgus 
monkey), 338 aa. 


31..356 
1..338 


300/338 (88%) 
312/338 (91%) 


e-174 



PFam analysis predicts that the NOV38a protein contains the domains shown in the 
Table 38E. 



Table 38E. Domain Analysis of NOV38a 

Pfam Domain | NOV38a Match Region | Identities/Similarities for the Matched Region | Expect Value 
ART | 1..312 | 164/340 (48%) / 312/340 (92%) | 1.5e-200 



Example 39. 

The NOV39 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 39A. 



Table 39A. NOV39 Sequence Analysis 




SEQ ID NO: 159 |8350 bp | 


NOV39a, 
CG150799-01 
DNA Sequence 


CAGGGAAAAGGGAACCTATGGAATGGTCATGGTGACTTTTGAGGTAGAGGGTGGCCCAAATCCCCCT 
GATGAAGATTTGAGTCCAGTTAAAGGAAATATCACCTTTCCCCCTGGCAGAGCAACAGTAATTTATA 
ACTTGACAGTACTCGATGACGAGGTACCAGAAAATGATGAAATATTTTTAATTCAACTGAAAAGTGT 
AGAAGGAGGAGCTGAGATTAACACCTCTAGGAATTCCATTGAGATCATCATTAAGAAAAATGATAGT 
CCCGTGAGATTCCTTCAGAGTATTTATTTGGTTCCTGAGGAAGACCACATACTCATAATTCCAGTAG 
TTCGTGGAAAGGACAACAATGGAAATCTGATTGGATCTGATGAATATGAGGTTTCAATCAGTTATGC 
TGTCACAACTGGGAATTCGACAGCACATGCCC^GCAAAATCTGGA 

ACAACTGTTGTTTTTCCACCTTTTATTCATGAATCTCACTTGAAATTTCAAATAGTTGATGACACCA 
CACCGGAGATTGCTGAATCGTTTCACATTATGTTACTAAAAGATACCTTACAGGGAGATGCTGTGCT 
AATAAGCCCTTCTGTTGTACAAGTCACCATTAAGCCAAATGATAAACCTTATGGAGTCCTTTCATTC 
AACAGTGTTTTGTTTGAAAGGACAGTTATAATTGATGAAGATAGAATATCAAGATATGAAGAAATCA 
CAGTGGTTAGAAATGGAGGAACCCATGGGAATGTCTCTGCGAATTGGGTGTTGACACGGAACAGCAC 
TGATCCCTCACCAGTAACAGCAGATATCAGACCGAGCTCTGGAGTTCTCCATTTTGCACAAGGGCAG 
ATGTTGGCAACMTTCCTCTTACTGTGGTTGATGATGATCTTCCAGAAGAGGCAGAAGCTTATCTAC 
TTCAAATTCTGCCTCATACAATACGAGGAGGTGCAGAAGTGAGCGAGCCAGCGGAGGATAGTGATGA 
TGTCTATGGCCTAATAACATTTTTTCCTATGGAAAACCAGAAGATTGAAAGCAGCCCAGGTGAACGA 
TACTTATCCTTGAGTTTTACAAGACTAGGAGGGACTAAAGGAGATGTGAGGTTGCTTTATTCTGTAC 
T TTACAT TCC TGCTGGAGCTGTGG AC CCCTTGC AAGCAAAAG AAGGCATC TTAAATATATCAAG GAG 
AAATGACCTCATTTTTCCAGAGCAAAAAACTCAAGTCACTACAAAATTACCAATAAGAAATGATGCA 
TTCTTTCAAAATGGAGCTGACTTTCTAGTACAGTTGGAAACTGT 

TAATCCCACCCATAAGCCCTAGATTTGGGGAAATCTGCAATATTTCTTTACTGGTTACTCO^CCAT 
TGCAAATGGAGAAATTGGCTTTCTCAGCAATCTTCCAATTATTTTGCATGAACCAGAAGATTTTGCT 
GCTGAAGTGGTATACATTCCCTTACATCXXK3ATGGAACTGA 

TGAAGCCCTCTGGCTTTAATTCAAAAGCAGTGACCCCGGATGATATAGGCCCCTTTAATGGCTCTGT 
TTTGTTTTTATC^ 

ATGAATGAAACTGTAACACTTTCTCTAGACAGGGTTAACGTGGAAAACCAAGTGCTGAAATCTGGAT 
ATACTAGCCGTGACCTAATTATTTTGGAAAATGATGACCCTGGGGGAGTTTTTGAATTTTCTCCTGC 
TTCC AGAGG ACC C TATGTTATAAAAG AAGGAG AATCTGT AGAG CTCCACATC ATCCG ATC AAGGGGG 
TCCCTTGTTAAGCAGTTTCTAC AC TACCGAGTAGAGCC AAG AGATAGCAATGAATTC TATGGAAACA 
C GGGAG TACTAG AATTTAAACC TGG AGAAAGGGAG ATAGTGATCACCTTGCTAGC AAGATTGGATGG 
G ATACC AGAGTTGGATG AACAC TACTGGG TGG TCC TCAGCAGCC ACGGAGAACGGGAAAGC AAGTTG 
GGAAGTGCC ACCATTGTCAATATAACGATTCTGAAAAATGATG ATCC TCATGG C ATTATAGAATTTG 
TTTCTGATGGTCTAATTGTGATGATAAATGAAAGCAAAGGAGATGCTATCTATAGTGCTGTTTATGA 
TGTAGTAAGAAATCGAGGCAACTTTGGTGATGTTAGTGTATCATGGGTGGTTAGTCCAGACTTTACA 
CAAGATGTATTTCCTGTACAAGGGACTGTTGTCTTTGGAGATCAGGAATTTTCAAAAAATATCACCA 
TTTACTCCCTTCCAGATGAGATTCCAGAAGAAATGGAAGAATTTACCGTTATCCTACTGAATGGCAC 
TGGAGGAGCTAAAGTGGGAAATAGAACAACTGCAACTCTGAGGATTAGAAGAAATGATGACCCCATT 
TATTTTG C AGAACCTCGTGTAGTG AGGGTTC AGG AAGGTGAGACTGCC AACTT TAC AGTTC TC AGAA 
ATGGATCTGTTGATGTGACTTGCATGGTCCAGTATGCTACCAAGGATGGGAAGGCTACTGCAAGAGA 
GAGAGATTTCATTCCTGTTGAAAAAGGAGAAACGCTCATTTTTGAGGTTGGAAGTAGACAGCAGAGC 
ATATCCATATTTGTTAATGAAGATGGTATCCCGGAAACAGATGAGCCCTTTTATATAATCCTCTTGA 
ATTCAACAGGTGATACAGTAGTATATCAATATGGAGTAGCTACAGTAATAATTGAAGCTAATGATGA 
C CCAAATGGC ATTTTTTC TCTGGAGCC CATAGAC AAAGC AGTGGAAG AAGGAAAGAC TAATGCATTT 
TGGATTTTGAGGCACCGAGGATACTTTGGTAGTGTTTCTGTATCTTGGCAGCTCTTTCAGAATGATT 
CTGCTTTGCAGCCTGGGCAGGAGTTCTATGAAACTTCAGGAACTGTTAACTTCATGGATGGAGAAGA 
AGCAAAAC C AATC ATTCTCC ATGC TTTTCC AGAT AT^AATTCCTGAATTCAATGAATTTTATT TCC TA 
AAACTTGTAAACATTTCAGGTCCTGGGGGCCAGCTAGCAGAAACCAACCTCCAGGTGACAGTAATGG 
TTCCATTC AATG ATG ATCCC TT TGG AGTTTTTAT C TTGGATCCAG AGTGTTT AGAGAGAGAAGTGG C 
AG AAGATG TC C TGTC TG AAG ATGAT ATGTCTTATATTACC AACTTC ACC ATTTTGAGGC AGC AGGGT 
GTGTTTGGTGATGTACAACTGGGCTGGGAAATACTGTCCAGTGAGTTCCCTGCTGGTTTGCCACCAA 
TGATAGATTTTTTAC TGGTTGG AATTTTCCC C ACCACCGTGCATTTAC AAC AGCACATGCGGCGTCA 
CCACAGTGGAACGGATGCTTTGTACTTTACCGGACTAGAGGGTGCATTTGGGACTGTTAATCCAAAA 
TACCATCCCTCCAGGAATAATACAATTGCCAACTTTACATTCTCAGCTTGGGTAATGCCCAATGCCA 
ATACG AATGG AT TC ATT ATAGCGAAG GATGACGG TAATGGAAGCATC TAC TACGGGGTAAAAATAC A 
AACAAACGAATCCCATGTGACACTTTCCCTTCATTATAAAACCTTGGGTTCCAATGCTACATACATT 
G CC AAGAC AACAGTC ATG AAATATTTAG AAG AAAGTGT TTGGCTTCATCTACTAATTATC CTGGAGG 
ATGGT AT AATCG AATTC TACC TGGATGG AAATGC AATGC CC AGGGGAATC AAGAGTCTGAAAGGAG A 
AGCC ATTACTGACGG TC CTGGGAT AC TGAGAATTGGAGCAGGGATAAATGGCAATGACAG ATTTACA 
GGTCTGATGCAGGATGTGAGGTCCTATGAGCGGAAACTGACGCTTGAAGAAATTTATGAACTTCATG 
CCATGCCCGCAAAAAGTGATTTACACCCAATTTCTGGATATCTGGAGTTCAGACAGGGAGAAACTAA 
CAAATC ATTC AT TATTTC TGC AAG AGATGAC AATGACGAGGAAGGAGAAGAATT ATTC ATTC TTAAA 
CTAGTTTCTGTATATGGAGGAGCTCGTATTTCGGAAGAAAATACTACTGCAAGATTAACAATACAAA 
AAAGTGACAATGCAAATGGCTTGTTTGGTTTCACAGGAGCTTGTATACCAGAGATTGCAGAGGAGGG 
ATCAACCATTTCTTGTGTGGTTGAGAGAACCAGAGGAGCTCTGGATTATGTGCATGTTTTTTACACC 
A TT TC AC AG ATTG AAAC TG ATGG CAT TAATT ACC TTGTTGATG AC TT TGC TAATG CC AG TGG AAC T A 
TTACATTC CTTCC TTGGC AGAGATC AGAGGTTC TGAATATATATGTTCTTGATGATGATATTCC TGA 
ACTTAATGAGTATTTCCGTGTGACATTGGTTTCTGCAATTCCTGGAGATGGGAAGCTAGGCTCAACT
CCTACCAGTGGTGCAAGCATAGATCCTGAAAAGGAAACGACTGATATCACCATCAAAGCTAGTGATC 
ATCCATATGGCTTGCTGCAGTTCTCCACAGGGCTGCCTCCTCAGCCTAAGGACGCAATGACCCTGCC 
TGCAAGCAGCGTTCCACATATCACTGTGGAGGAGGAAGATGGAGAAATCAGGTTATTGGTCATCCGT 
GCACAGGGACTTCTGGGAAGGGTGACTGCGGAATTTAGAACAGTGTCCTTGACAGCATTCAGTCCTG 
AGGATTACCAGAATGTTGCTGGCACATTAGAATTTCAACCAGGAGAAAGATATAAATACATTTTCAT 
AAACATCACTGATAATTCTATTCCTGAACTGGAAAAATCTTTTAAAGTTGAGTTGTTAAACTTGGAA 
GGAGGAGTAGCTGAACTCTTTAGGGTTGATGGAAGTGGTAGTGCCAGTCTAGGAGTGGCTTCCCAAA 
TTCTAGTGACAATTGCAGCCTCTGACCACGCTCATGGCGTATTTGAATTTAGCCCTGAGTCACTCTT 
TGTCAGTGGAACTGAACCAGAAGATGGGTATAGCACTGTTACATTAAATGTTATAAGACATCATGGA 
ACTCTGTCTCCAGTGACTTTGCATTGGAACATAGACTCTGATCCTGATGGTGATCTCGCCTTCACCT 
CTGGCAACATCACATTTGAGATTGGGCAGACGAGCGCCAATATCACTGTGGAGATATTGCCTGACGA 
AGACC C AG AACTGGATAAGGC ATTCTCTGTGTCAGTCCTCAGTGTTTCCAGTGGTTCTTTGGGAGC T 
CATATTAATGCCACGTTAACAGTTTTGGCTAGTGATGATCCATATGGGATATTCATTTTTTCTGAGA 
AAAACAGACCTGTTAAAGTTGAGGAAGCAACCCAGAACATCACACTATCAATAATAAGGTTGAAAGG 
CCTCATGGGAAAAGTCCTTGTCTCATATGCAACACTAGATGATATGGAAAAACCACCTTATTTTCCA 
CCTAATTTAGCG AG AGC AAC TCAAGGAAGAGACT AT ATACC AGCTTC TGGATTTGCTCTTTTTGGAG 
CTAATCAGAGTGAGGCAACAATAGCTATTTCAATTTTGGATGATGATGAGCCAGAAAGGTCCGAATC 
TGTCTTTATCG AAC TACTC AAC TC TACTTTAGTAGCGAAAGTACAGAGTCGTTC AATTCC AAATTCT 
CCACGTCTTGGGCCTAAGGTAGAAACTATTGCGCAACTAATTATCATTGCCAATGATGATGCATTTG 
G AAC TCTTCAGC TC TCAGC ACC AATTGTCCGAGTGGC AGAAAATC ATGTTGG AC CC ATTATC AATGT 
GACTAGAACAGGAGGAGCATTTGCAGATGTCTCTGTGAAGTTTAAAGCTGTGCCAATAACTGCAATA 
GCTGGTGAAGATTATAGTATAGCTTCATCAGATGTGGTCTTGCTAGAAGGGGAAACCAGTAAAGCCG 
TGCC AAT ATATGTC ATTAATGATATC T ATCC TG AAC TGG AAG AATCTTTTCTTG TGC AACTG ATGAA 
TGAAACAACAGGAGGAGCCAGACTAGGGGCTTTAACAGAGGCAGTCATTATTATTGAGGCCTCTGAT 
GACCCCTATGGATTATTTGGTTTTCAGATTACTAAACTTATTGTAGAGGAACCTGAGTTTAACTCAG 
TGAAGGTAAACCTGCCAATAATTCGAAATTCTGGGACACTCGGCAATGTTACTGTTCAGTGGGTTGC 
CACCATTAATGGACAGCTTGCTACTGGCGACCTGCGAGTTGTCTCAGGTAATGTGACCTTTGCCCCT 
GGGGAAACCATTCAAACCTTGTTGTTAGAGGTCCTGGCTGACGACGTTCCGGAGATTGAAGAGGTTA 
TCC AAGTGC AAC TAACTGATGCCTC TGGTGGAGGTACTATTGGGTTAGATCGAATTGCAAAT ATTAT 
TATTC C TGCC AATGATGATCCTTATGGTACAGT AGCCTTTGCTCAGATGGTTTATCGTGTTC AAGAG 
CCTC TGGAAAGAAGTTCCTGTGCTAATATAAC TGTCAGGCGAAGCGGAGGGC ACTTTGGTCGGCTGT 
TGTTGTTCTACAGTACTTCCGACATTGATGTAGTGGCTCTGGCAATGGAGGAAGGTCAAGATTTACT 
GTCCTACT ATGAATC TCC AATTC AAGGGGTGCCTGACCCACTTTGGAGAACTTGGATGAATGTCTC T 
CATTTTTCAGTGCTTCTGAGGGTCCCCAGTGTTTCTGGATGACATCATGGATCAGCCCAGCTGTCAA
C AATTC AGACTTC TGG ACC TAG AGGAAAAAC ATG ACCAGGGT AGCATC TCTTTTTAGTGGTCAGGCT 
GTGGCTGGGAGTGACTATGAGCCTGTGACAAGGCAATGGGCCATAATGCAGGAAGGTGATGAATTCG 
CAAATCTCACAGTGTCTATTCTTCCTGATGATTTCCCAGAGATGGATGAGAGTTTTCTAATTTCTCT 
CCTTGAAGTTCACCTCATGAACATTTCAGCCAGTTTGAAAAATCAGCCAACCATAGGACAGCCAAAT 
ATTTCTACAGTTGTCATAGCACTAAATGGTGATGCCTTTGGAGTGTTTGTGATCTACAATATTAGTC 
C C AAT ACTTCCGAAG ATGGCTTATTTGTTGAAGTTC AGGAGC AGC CCC AAACC TTGGTGGAGC TGAT 
GATACACAGGACAGGGGGCAGCTTAGGTCAAGTGGCAGTCGAATGGCGTGTTGTTGGTGGAACAGCT 
ACTGAAGGTTTAGATTTTATAGGTGCTGGAGAGATTCTGACCTTTGCTGAAGGTGAAACCAAAAAGA 
C AGTC ATT TTAAC C ATCTTGG ATGACTC TGAACC AGAGGATG ACG AAAGTATC ATAGTT AGTTTGGT 
GTACACTGAAGGTGGAAGTAGAATTTTGCCAAGCTCCGACACTGTTAGAGTGAACATTTTGGCCAAT 
GACAATGTGGCAGGAATTGTTAGCTTTCAGACAGCTTCCAGATCTGTCATAGGTCATGAAGGAGAAA 
TTTTACAATTCCATGTGATAAGAACTTTCCCTGGTCGAGGAAATGTTACTGTTAACTGGAAAATTAT 
TGGGCAAAATCTAGAACTCAATTTTGCTAACTTTAGCGGACAACTTTTCTTTCCTGAGGGGTCGTTG 
AATACAACATTGTTTGTGCATTTGTTGGATGACAACATTCCTGAGGAGAAAGAAGTATACCAAGTCA 
TTCTGTATGATGTCAGGACACAAGGAGTTCCACCAGCCGGAATCGCCCTGCTTGATGCTCAAGGATA
TGC AGC TGTCC TC ACAGTAG AAGCCAGTG ATG AAC C AC ATGG AGTTTTAAATTTTGCTCTTTC ATC A 
AG ATTTGTGTT ACTAC AAG AGGCTAAC ATAACAATTC AGC TTTTC ATCAAC AGAGAATTTGGATC TC 
TAGGAGCTATCAATGTCACATATACCACGGTTCCTGGAATGCTGAGTCTGAAGAACCAAACAGTAGG 
AAACC T AGC AGAGCC AGAAGTTG ATTTTGTCCCTATC ATTGGCTTTC TGATTT TAGAAGAAGGGGAA 
ACAGCAGCAGCCATCAACATTACCATTCTTGAGGATGATGTACCAGAGCTAGAAGAATATTTCCTGG 
TGAATTT AAC TTACGTTGGACTTACC ATGGCTGCTTC AAC TTC ATTTCC TC CC AGAC TAGGTATGAG 
nr^TTTPTTnTTTnTTTrTTTTTnCTCACTTCAAATGAAATG AAGAAACTTCATTTTTGAATCAGAA 
GTGATCATTGTCCTOTTOTOTra 

ORF Start: ATG at 23 | ORF Stop: TGA at 8282



SEQ ID NO: 160 | 2753 aa | MW at 301743.8kD
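
The listed ORF coordinates and the stated protein length can be cross-checked arithmetically. The following is a minimal Python sketch, assuming 1-based positions and assuming that the listed stop position is the first base of the stop codon (an inference from the listed values, not something the source states). The function name `orf_protein_length` is hypothetical.

```python
def orf_protein_length(start_pos: int, stop_pos: int) -> int:
    """Number of residues encoded between a start codon and a stop codon.

    Assumes 1-based positions, with start_pos at the A of ATG and
    stop_pos at the first base of the stop codon (an assumption
    inferred from the listed values, not stated in the source).
    """
    coding_bases = stop_pos - start_pos
    if coding_bases <= 0 or coding_bases % 3 != 0:
        raise ValueError("start and stop are not in the same reading frame")
    return coding_bases // 3


# Values as listed for NOV39a: ORF start at 23, ORF stop at 8282.
print(orf_protein_length(23, 8282))  # 2753
```

Under these assumptions the result agrees with the 2753 aa length given for NOV39a.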


NOV39a, 
CG150799-01 
Protein Sequence 


MVMVTFEVEGG PNPPDEDL S P VKGNI TPP PGRATVI YNLTVLDDEVPENDE I FL IQLK S VEGGAE IN 
TSRNSIEIIIKKNDSPVRFLQSIYLVPEEDHILIIPVVRGKDNNGNLIGSDEYEVSISYAVTTGNST
AHAQQNLDF I DLQPNTTWFPPF IHESHLKFQI VDDTT PE I AES FH IMLLKDTLQGDAVLI S P SWQ 
VTIKPNDKPYGVLSFNSVLFERTVI IDEDR I SRYEE I TWRNGGTHGNVSANWVLTRNSTDPS PVTA 
D IRP SSGVLHF AQGQMLAT IPLT WDDDLPEEAEAYLLQI LPHT IRGG AEVSEPAEDSDDVYGL I TF 
F PMENQK I E S S PGERYLSL S FTRLGGTKGDVRLL YS VL YI PAGAVDPLQ AKEGILNI SRRNDL I FPE 
QKTQVTTKLPIRNDAFFQNGAHFIiVQLETVEIiLNI I PL I P P I S PRFGE ICN I SLLVT PAI ANGE IGF 
LSNLPIILHEPETFAAEWYIPLHRDGTDGQATVYWSLKPSGFNSKAVTPDDIGPFNGSVLFLSGQS 
DTTINITIKGDDIPEMNETVTLSLDRVNVENQVLKSGYTSRDLIILENDDPGGVFEFSPASRGPYVI 
KEGE SVELHI IR SRGSLVKQFLHYRVEPRDSNEF YGNTGVLE FK PGERE I VI TLLARLDGI PELDEH 
YWVVLSSHGERESKLGSATITOITILKNDDPHGIIEFVSDGLIVMINESKGDAIYSAVYPVVRNR 
FGDVSVSWWS PDFTQDVF P VQGTWFGDQEFSKN I T I YSLPDEI PEEMEEFTVILLNGTGGAKVGN 
RTT ATLR I RRNDDP I YFAE PRVVRVQEGET ANFTVLRNGSVDVTCMVQYATKDGKATARERDF I PVE 
KGETL I FEVG SRQQS I S I FVNEDG I PETDEPFYI I LLNSTGDTWYQYGVATVI I EANDDPNG I FSL 
EPIDKAVEEGKTNAFWILRHRGYFGSVSVSWQLFQNDSALQPGQEFYETSGTVNFMDGEEAKPIILH 
AFPDKIPEFNEFYFLKLWISGPGGQLAETNLQVTVMVPFNDDPFGWILDPECLEREVAEDVLSE^ 
DMS YITNFT I LRQQGVFGDVQLGWEI LS S EF PAGL PPM IDFLL VG IFPTTVHLQQHMRRHHSGTDAL 
YFTGLEGAFGTVNPKYHPSRI^TIANFTFSAWVMPNANTNGFI I AKDDGNGS I YYGVKIQTNESHVT 
LSLHYKTLGSNATYIAKTTVMKYLEESVVn^LLIILEDGIIEFYLDGNAMPRGIKSLKGEAITDGPG 
ILRIGAGINGNDRFTGLMQDVRSYERKLTLEEIYELHAMPAKSDLHPISGYIjEFRQGETNKSFIISA 
RDDNDEEGEELF I LKLVSVYGGARI SEENTTARLT I QK SDN ANGLFGFTG AC I PEI AEEGSTI SCW 
ERTRGALDYVHVFYTI SQI ETDGINYLVDDFANASGTI TFLPWQRSEVLNI YVLDDDI PELNEYFRV 
TLVSAIPGDGKLGSTPTSGASIDPEKETTDITIKASDHPYGLLQFSTGLPPQPKDAMTLPASSVPHI 
TVEEEDGE IRLL VI RAQGLLGRVTAEFRTVSLTAFS PEDYQNVAGTLEFQ PGERYKYI F INITDNS I 
PELEKSFKVELLNLEGGVAELFRVDGSGSASLGVASQILVTIAASDHAHGVFEFSPESLFVSGTEPE 
DG YSTVTLNV I RHHGTL S PVTLHWNIDSDPDGDL AFTSGN I TFEI GQTS ANI TVE I L PDEDPELDKA 
FSVSVLSVSSGSLGAHINATLTVLASDDPYGIF IFSEKNRPVKVEEATQNITLSI IRLKGLMGKVLV 
S YATLDDMEKPPYFPPNLARATQGRDYI PASGFALFGANQSEATI AI S ILDDDEPERSESVFI ELLN 
STLVAKVQSRS I PNS PRLGPKVET I AQL 1 1 1 ANDDAFGTLQLSAPI VRVAENHVGPI INVTRTGGAF 
ADVSVKFKAVPITAIAGEDYSIASSDVVLLEGETSKAVPIYVINDIYPELEESFLVQLMNETTGGAR 
LGALTEAVIIIEASDDPYGLFGFQITKLIVEEPEFNSVKVNLPIIKNSGTLGNVTVQWVATINGQLA 
TGDLRWSGNVTFAPGETIQTLLLEVLADDVPEIEEVIQVQLTDASGGGTIGLDRIANIIIPANDDP 
YGTVAFAQMVYRVQEPLERSSCANITVRRSGGHFGRLLLFYSTSDIDWALAMEEGQDLLSYYESPI 
QGVPDPLWRTWMNVS AVGEPLYTC ATLCLKEQACSAFS FFSASEGPQCFWMTS WI S PAVNNSDFWT Y 
RKNMTRVASLFSGQ AVAG SDYE PVTRQWAIMQEGDEFANLTVS I L PDDFPEMDESFL I SLLEVHLMN 
ISASLKNQPTIGQPNISTWIALNGDAFGVFVIYNISPNTSEDGLFVEVQEQPQTLVELMIHRTGGS 
LGQVAVEWRWGGTATEGLDF IGAGE ILTF AEGETKKTVILTI LDDS EPEDDES 1 1 VSLVYTEGG SR 
I L PS SDTVRVN I L ANDNVAGIVSFQTASRSVIGHEGE ILQFHVI RT F PGRGNVTVNWK I IGQNLELN 
FANFSGQLFFPEGSLNTTLFVHLLDDNIPEEKEVYQVILYIJVRTQGVPPAGIALLDAQGYAAVLTVE 
ASDEPHGVLNFALSSRFVIjLOEANITIOLFINREFGSLGAINVTYTWPGMLSIiKNOWGNLAEPE^ 
DFVPIIG^

CSLQMK 



SEQ ID NO: 161 



i!lg25bp_ 1 



NOV39b, 
CG150799-02 
DNA Sequence 



CAGGGAAAAGGGAACCTATGGAATGGTCATGGTGACTTTTGAGGTAGAGGGTGGCCCAAATCCCCCT
GATGAAGATTTGAGTCCAGTTAAAGGAAATATCACCTTTCCCCCTGGCAGAGCAACAGTAATTTATA 
ACTTGACAGTACTCGATGACGAGGTACCAGAAAATGATGAAATATTTTTAATTCAACTGAAAAGTGT 
AGAAGGAGGAGCTGAGATTAACACCTCTAGGAATTCCATTGAGATCATCATTAAGAAAAATGATAGT 
CCCGTGAGATTCCTTCAGAGTATTTATTTGGTTCCTGAGGAAGACCACATACTCATAATTCCAGTAG 
TTCGTGGAAAGGACAACAATGGAAATCTGATTGGATC TGATGAATATGAGGTTTC AATCAGTTATGC 
TGTCACAACTGGGAATTCCACAGCACATGCCCAGCAAAATCTGGACTTCATTGATCTTCAGCCAAAC 
AC AAC TGTTGTTTTTCC ACCTTTTATTCATGAATC TC ACTTGAAATTTC AAATAGTTGATGACAC C A 
CACCGGAGATTGCTGAATCGTTTCACATTATGTTACTAAAAGATACCTTACAGGGAGATGCTGTGCT 
AATAAGCCCTTCTGTTGTACAAGTCACCATTAAGCCAAATGATAAACCTTATGGAGTCCTTTCATTC 
AACAGTGTTTTGTTTGAAAGGACAGTTATAATTGATGAAGATAGAATATCAAGATATGAAGAAATCA 
CAGTGGTTAGAAATGGAGGAACCCATGGGAATGTCTCTGCGAATTGGGTGTTGACACGGAACAGCAC 
TGATCCCTC ACC AGTAACAGCAGATATCAGACC GAGC TC TGGAGTTCTC CATTTTGC ACAAGGGC AG 
ATGTTGGCAACAATTCCTCTTACTGTGGTTGATGATGATCTTCCAGAAGAGGCAGAAGCTTATCTAC 
TTCAAATTCTGCCTCATACAATACGAGGAGGTGCAGAAGTGAGCGAGCCAGCGGAGGATAGTGATGA 
TGTCTATGGCCTAATAACATTTTTTCCTATGGAAAACCAGAAGATTGAAAGCAGCCCAGGTGAACGA 
TAC TTATCC TTGAGTTTTACAAGACTAGGAGGG AC TAAAGGAG ATGTGAGG TTGC TTTATTCTGT AC 
TTTACATTCCTGCTGGAGCTGTGGACCCCTTGCAAGCAAAAGAAGGCATCTTAAATATATCAAGGAG 
AAATGAC C TC ATTTTTCCAGAGC AAAAAACTCAAGTCAC TAC AAAATTACCAATAAGAAATGATGCA 
T TC TTTC AAAATGGAGCTC AC TTTCTAGTAC AGTTGGAAAC TGTGGAGTTGTTAAAC ATAATTCC TC 
TAATCCCACCCATAAGCCCTAGATTTGGGGAAATCTGCAATATTTCTTTACTGGTTACTCCAGCCAT 
TGCAAATGGAGAAATTGGCTTTCTCAGCAATCTTCCAATTATTTTGCATGAACCAGAAGATTTTGCT 
GCTGAAGTGGTATACATTCCCTTACATCGGGATGGAACTGATGGCCAGGCTACTGTCTACTGGAGTT 

T T TGTTT T TATC TGGGCAAAGTG AC ACAACAATC AACAT TAC TATC AAAGGTG ATGAC ATACCGG AA 
ATGAATGAAACTGTAACACTTTCTCTAGACAGGGTTAACGTGGAAAACCAAGTGCTGAAATCTGGAT 
ATACTAGCCGTGACCTAATTATTTTGGAAAATGATGACCCTGGGGGAGTTTTTGAATTTTCTCCTGC 
TTCCAGAGGACCCTATGTTATAAAAGAAGGAGAATCTGTAGAGCTCCACATCATCCGATCAAGGGGG 
TCCCTTGTTAAGCAGTTTCTACACTACCGAGTAGAGCCAAGAGATAGCAATGAATTCTATGGAAACA 
CGGGAGTACTAGAATTTAAACCTGGAGAAAGGGAGATAGTGATCACCTTGCTAGCAAGATTGGATGG 
GATACCAGAGTTGGATGAACACTACTGGGTGGTCCTCAGCAGCCACGGAGAACGGGAAAGCAAGTTG 
GGAAGTGCCACCATTGTCAATATAACGATTCTGAAAAATGATGATCCTCATGGCATTATAGAATTTG 
TTTC TGATGGTC T AATTGTGATGATAAATGAAAGC AAAGGAGATGCTATCTATAGTGCTGTTTATGA 
TGTAGTAAGAAATCGAGGCAACTTTGGTGATGTTAGTGTATCATGGGTGGTTAGTCCAGACTTTACA 
C AAGATGTATTTCC TGTACAAGGGACTGT TGTC TTTGGAGATCAGG AATTTTC AAAAAATATCACCA 
TTT ACTCC C TTCC AG ATGAGATTCCAGAAGAAATGGAAGAATTTACCGTTATCCT AC TG AATGGCAC 
TGGAGGAGCTAAAGTGGGAAATAGAACAACTGCAACTCTGAGGATTAGAAGAAATGATGACCCCATT 
TATTTTGCAGAACCTCGTGTAGTGAGGGTTCAGGAAGGTGAGACTGCCAACTTTACAGTTCTCAGAA 
ATGGATCTGTTGATGTGACTTGCATGGTCCAGTATGCTACCAAGGATGGGAAGGCTACTGCAAGAGA 
G AG AG ATTTC ATTCC TGTTG AAAAAGGAGAAACGC TC ATT TTTG AGGTTGGAAGTAG AC AGC AGAGC 
ATATCCATATTTGTTAATGAAGATGGTATCCCGGAAACAGATGAGCCCTTTTATATAATCCTCTTGA 
ATTCAACAGGTGATACAGTAGTATATCAATATGGAGTAGCTACAGTAATAATTGAAGCTAATGATGA 

TGGATTTTGAGGCACCGAGGATACTTTGGTAGTGTTTCTGTATCTTGGCAGCTCTTTCAGAATGATT 
C TGCTTTGC AGCC TGGGCAGGAGTTCTATGAAAC TTC AGGAAC TGTTAACTTC ATGGATGGAG AAGA 
AGCAAAACCAATCATTCTCCATGCTTTTCCAGATAAAATTCCTGAATTCAATGAATTTTATTTCCTA 
AAACTTGTAAACATTTCAGGTCCTGGGGGCCAGCTAGCAGAAACCAACCTCCAGGTGACAGTAATGG 
TTCCATTCAATGATGATCCCTTTGGAGTTTTTATCTTGGATCCAGAGTGTTTAGAGAGAGAAGTGGC 
AGAAGATGTCCTGTCTGAAGATGATATGTCTTATATTACCAACTTCACCATTTTGAGGCAGCAGGGT 
GTGTTTGGTGATGTACAACTGGGCTGGGAAATACTGTCCAGTGAGTTCCCTGCTGGTTTGCCACCAA 
TGATAGATTTTTTACTGGTTGGAATTTTCCCCACCACCGTGCATTTACAACAGCACATGCGGCGTCA 
C C AC AGTGG AACGG ATGC TT TGTACTTTACCGGACTAGAGGG TGC ATTTGGGACTGTTAATCC AAAA 
TACC ATCCC TCC AGG AATAATAC AATTGCC AACTTTAC ATTC TCAGCTTGGGTAATGCCC AATGCC A 
ATACGAATGGATTCATTATAGCGAAGGATGACGGTAATGGAAGCATCTACTACGGGGTAAAAATACA 
AAC AAACG AATCC C ATGTGAC ACTTTCC C TTC ATTATAAAAC C TTGGGTTCCAATGC TAC AT AC ATT 
GCCAAGACAACAGTCATGAAATATTTAGAAGAAAGTGTTTGGCTTCATCTACTAATTATCCTGGAGG 
ATGGTATAATCGAATTCTACCTGGATGGAAATGCAATGCCCAGGGGAATCAAGAGTCTGAAAGGAGA 
AGCCATTACTGACGGTCCTGGGATACTGAGAATTGGAGCAGGGATAAATGGCAATGACAGATTTACA 
GGTCTGATGCAGGATGTGAGGTCCTATGAGCGGAAACTGACGCTTGAAGAAATTTATGAACTTCATG
CC ATGCC CGC AAAAAGTGATTTACACCCAATTTCTGGATATC TGGAGTTCAGACAGGGAGAAACT AA 
CAAATCATTCATTATTTCTGCAAGAGATGACAATGACGAGGAAGGAGAAGAATTATTCATTCTTAAA
CTAGTTTCTGTATATGGAGGAGCTCGTATTTCGGAAGAAAATACTACTGCAAGATTAACAATACAAA 
AAAGTGACAATGCAAATGGCTTGTTTGGTTTCACAGGAGCTTGTATACCAGAGATTGCAGAGGAGGG 
ATCAACCATTTCTTGTGTGGTTGAGAGAACCAGAGGAGCTCTGGATTATGTGCATGTTTTTTACACC 
ATTTCACAGATTGAAACTGATGGCATTAATTACCTTGTTGATGACTTTGCTAATGCCAGTGGAACTA
TTAGATTTCCTTC

ACTTAATGAGTATTTCCGTGTGACATTGGTTTCTGCAATTCCTGGAGATGGGAAGCTAGGCTCAACT 

CCTACCAGTGGTGCAAGCATAGATCCTGAAAAGGAAACGACTGATATCACCATCAAAGCTAGTGATC 

ATCCATATGGCTTGCTGCAGTTCTCCACAGGGCTGCCTCCTCAGCCTAAGGACGCAATGACCCTGCC 

TGCAAGCAGCGTTCCACATATCACTGTGGAGGAGGAAGATGGAGAAATCAGGTTATTGGTCATCCGT 

GCACAGGGACTTCTGGGAAGGGTGACTGCGGAATTTAGAACAGTGTCCTTGACAGCATTCAGTCCTG 

AGGATTACCAGAATGTTGCTGGCACATTAGAATTTCAACCAGGAGAAAGATATAAATACATTTTCAT 

AAAC AT C AC TG AT AAT TC T ATTC C TGAAC TGG AAAAATCTTTTAAAG TTGAG T TGTT AAAC TTGG AA 

GGAGG AGT AGCTG AAC TCTTTAGGGTTGATGGAAGTGGTAGTGCCAGTC TAGGAGTGGCTTCCC AAA 

TTCTAGTGACAATTGCAGCCTCTGACCACGCTCATGGCGTATTTGAATTTAGCCCTGAGTCACTCTT 

TGTCAGTGGAAC TG AACCAG AAG ATGGGTATAGCAC TGTTAC ATTAAATGTTATAAGACATC ATGG A 

AC TCTGTCTCCAGTG ACTTTGCATTGGAACAT AGAC TC TG ATC C TGATGGTGATCTCGCC TTC ACC T 

CTGGCAACATCACATTTGAGATTGGGCAGACGAGCGCCAATATCACTGTGGAGATATTGCCTGACGA 

AGACCCAGAACTGGATAAGGCATTCTCTGTGTCAGTCCTCAGTGTTTCCAGTGGTTCTTTGGGAGCT 

CATATTAATGCCACGTTAACAGTTTTGGCTAGTGATGATCCATATGGGATATTCATTTTTTCTGAGA 

AAAACAGACCTGTTAAAGTTGAGGAAGCAACCCAGAACATCACACTATCAATAATAAGGTTGAAAGG 

CCTCATGGGAAAAGTCCTTGTCTCATATGCAACACTAGATGATATGGAAAAACCACCTTATTTTCCA 

CCTAATTTAGCGAGAGCAACTCAAGGAAGAGACTATATACCAGCTTCTGGATTTGCTCTTTTTGGAG 

CTAATCAGAGTGAGGCAACAATAGCTATTTCAATTTTGGATGATGATGAGCCAGAAAGGTCCGAATC 

TGTCTTTATCGAACTACTCAACTCTACTTTAGTAGCGAAAGTACAGAGTCGTTCAATTCCAAATTCT 

CCACGTCTTGGGCCTAAGGTAGAAACTATTGCGCAACTAATTATCATTGCCAATGATGATGCATTTG

GAACTCTTCAGCTCTCAGCACCAATTGTCCGAGTGGCAGAAAATCATGTTGGACCCATTATCAATGT 

GACTAGAACAGGAGGAGCATTTGCAGATGTCTCTGTGAAGTTTAAAGCTGTGCCAATAACTGCAATA 

GCTGGTGAAGATTATAGTATAGCTTCATCAGATGTGGTCTTGCTAGAAGGGGAAACCAGTAAAGCCG 

TGC CAATATATGTC ATT AATGATATCTATCCTGAAC TGGAAGAATCTTTTC TTGTGC AACTGATGAA 

TGAAACAACAGGAGGAGCCAGACTAGGGGCTTTAACAGAGGCAGTCATTATTATTGAGGCCTCTGAT 

GACCCCTATGGATTATTTGGTTTTCAGATTACTAAACTTATTGTAGAGGAACCTGAGTTTAACTCAG 

TGAAGGTAAACCTGCCAATAATTCGAAATTCTGGGACACTCGGCAATGTTACTGTTCAGTGGGTTGC 

CACCATTAATGGACAGCTTGCTACTGGCGACCTGCGAGTTGTCTCAGGTAATGTGACCTTTGCCCCT 

GGGGAAACCATTCAAACCTTGTTGTTAGAGGTCCTGGCTGACGACGTTCCGGAGATTGAAGAGGTTA 

TCCAAGTGCAACTAACTGATGCCTCTGGTGGAGGTACTATTGGGTTAGATCGAATTGCAAATATTAT 

TATTCC TGCC AATGATG ATCCTTATGGTAC AGTAGC CTTTGC TCAGATGGTTTATCGTGTTC AAGAG 

CCTCTGGAAAGAAGTTCCTGTGCTAATATAACTGTCAGGCGAAGCGGAGGGCACTTTGGTCGGCTGT 

TGTTGTTCTACAGTACTTCCGACATTGATGTAGTGGCTCTGGCAATGGAGGAAGGTCAAGATTTACT 

GTC CTAC T ATGAATCTC CAATTCAAGGGGTGCCTGACCCACTTTGGAGAAC TTGGATGAATGTCTCT 

GCCGTGGGGGAGCCCCTGTATACCTGTGCCACTTTGTGCCTTAAGGAACAAGCTTGCTCAGCGTTTT

CATTTTTCAGTGCTTCTGAGGGTCCCCAGTGTTTCTGGATGACATCATGGATCAGCCCAGCTGTCAA 

CAATTCAGACTTCTGGACCTACAGGAAAAACATGACCAGGGTAGCATCTCTTTTTAGTGGTCAGGCT 

GTGGCTGGGAGTGACTATGAGCCTGTGACAAGGCAATGGGCCATAATGCAGGAAGGTGATGAATTCG 

CAAATCTC AC AGTGTC T ATTCTTCCTGATGATT TCCC AGAGATGGATGAGAGTTTTCTAATTTCTC T 

CCTTGAAGTTCACCTCATGAACATTTCAGCCAGTTTGAAAAATCAGCCAACCATAGGACAGCCAAAT 

ATTTCTACAGTTGTCATAGCACTAAATGGTGATGCCTTTGGAGTGTTTGTGATCTACAGTATTAGTC 

CCAATACTTCCGAAGATGGCTTATTTGTTGAAGTTCAGGAGCAGCCCCAAACCTTGGTGGAGCTGAT 

GATACACAGGACAGGGGGCAGCTTAGGTCAAGTGGCAGTCGAATGGCGTGTTGTTGGTGGAACAGCT 

ACTGAAGGTTTAGATTTTATAGGTGCTGGAGAGATTCTGACCTTTGCTGAAGGTGAAACCAAAAAGA 

CAGTCATTTTAACCATCTTGGATGACTCTGAACCAGAGGATGACGAAAGTATCATAGTTAGTTTGGT 

GTACACTGAAGGTGGAAGTAGAATTTTGCCAAGCTCCGACACTGTTAGAGTGAACATTTTGGCCAAT 

GACAATGTGGCAGGAATTGTTAGCTTTCAGACAGCTTCCAGATCTGTCATAGGTCATGAAGGAGAAA 

TTTTACAATTCCATGTGATAAGAACTTTCCCTGGTCGAGGAAATGTTACTGTTAACTGGAAAATTAT 

TGGGC AAAATCTAGAAC TC AATTTTGC TAACTTTAGCGGAC AACTTTTC TTTCCTGAGGGGTCGTTG 

AATACAACATTGTTTGTGCATTTGTTGGATGACAACATTCCTGAGGAGAAAGAAGTATACCAAGTCA 

TTCTGTATGATGTCAGGACACAAGGAGTTCCACCAGCCGGAATCGCCCTGCTTGATGCTCAAGGATA 

TGCAGCTGTCCTCACAGTAGAAGCCAGTGATGAACCACATGGAGTTTTAAATTTTGCTCTTTCATCA 

AGATTTGTGTTACTACAAGAGGCTAACATAACAATTCAGCTTTTCATCAACAGAGAATTTGGATCTC 

TAGGAGCTATCAATGTCACATATACCACGGTTCCTGGAATGCTGAGTCTGAAGAACCAAACAGTAGG 

AAACCTAGCAGAGCCAGAAGTTGATTTTGTCCCTATCATTGGCTTTCTGATTTTAGAAGAAGGGGAA 

AC AGC AGC AGCC ATC AAC ATTACC ATTC TTGAGGATG ATGTACCAGAGC TAGAAGAATATTTCC TGG 

TGAATTTAACTTACGTTGGACTTACCATGGCTGCTTCAACTTCATTTCCTCCCAGACTAGATTCAGA 

AGGTTTGACTGCACAAGTTATTATTGATGCCAATGATGGGGCCCGAGGTGTAATTGAATGGCAACAA 

AGCAGGTTTGAAGTAAATGAAACCCATGGAAGTTTAACATTGGTAGCCCAGAGGAGCAGAGAACCTC 

TTGGCCATGTTTCCTTATTTGTGTATGCTCAGAATTTGGAAGCACAAGTGGGGCTGGATTATATCTT 

CACCCCAATGATTCTTCATTTTGCTGATGGAGAAAGGTATAAAAATGTCAATATCATGATTCTTGAT 

GATGACATTCCAGAAGGAGATGAAAAATTTCAGCTGATTTTAACAAATCCTTCTCCTGGACTAGAGC 

TAGGGAAAAATACAATAGCCTTAATTATTGTCCTTGCTAATGATGACGGCCCTGGAGTTCTATCATT 

TAACAACAGTGAGCACTTTTTCCTAAGAGAGCCAACAGCTCTCTACGTCCAGGAGAGTGTTGCAGTA 

TTGTACATTGTTCGGGAACCTGCACAAGGATTGTTTGGAACAGTGACAGTTCAGTTCATTGTGACAG 

AAGTGAATTCCTCAAATGAATCTAAAGATCTGACTCCTTCCAAAGGCTATATTGTTTTAGAAGAAGG 

TGTTCGATTCAAGGCCCTACAAATATCTGCCATATTAGACACGGAACCAGAAATGGATGAGTATTTT 

GTTTGCACCTTGTTTAATCCAACTGGAGGTGCTAGACTAGGGGTGCATGTTCAAACCCTGATAACAG 

TTTTGCAAAACCAGGCCCCTTTGGGGCTATTCAGTATCTCTGCAGTTGAAAATAGAGCCACCTCCAT 

AGACATCGAAGAAGCCAATAGGACCGTGTATTTAAATGTATCTCGAACTAATGGCATTGATTTGGCT 

GTGAGTGTGCAGTGGGAGACAGTATCTGAAACAGCCTTTGGCATGAGGGGAATGGATGTTGTGTTT 

CCGTATTTCAAAGTTTTTTGGATGAATCAGCTTCTGGCTGGTGTTTCTTTACTTTGGAAAATTTAAT 

ATATGGTATAATGTTAAGAAAATCATCTCTTACTGTTTACCGATGGCAGGGGATTTTTATTCCAGTT 

GAGGATTTAAATATAGAAAATCCTAAAACTTGTGAGGCCTTTAATATTGGTTTTTCTCCCTACTTTG 

TGATTACTCATGAAGAAAGAAATGAAGAAAAGCCTTCTCTTAACAGTGTGTTTACATTCACATCTGG 
&TTTAAATTATTCCTGGTACA^ "

3ACAGCCAAGATTATTTAATCATTGCAAGTCAAAGAGATGATTCCGAATTAACTCAGGTCTTCAGGT 

(3GAATGGAGGAAGCTTCGTGTTGCATCAAAAACTCCCTGTCCGAGGTGTGCTGACCGTGGCCTTGTT 

CAACAAGGGAGGCTCTGTGTTCTTAGCCATTTCCCAGGCTAATGCCAGGCTAAACTCCCTTTTATTC 

AGATGGTCTGGCAGTGGGTTTATTAACTTTCAAGAGGTGCCTGTCAGTGGGACAACAGAAGTTGAGG 

CTTTGTCTTCAGCCAATGATATTTACCTAATATTTGCCAAAAATGTCTTTCTAGGAGATCAGAATTC 

AATTGATATTTTCATCTGGGAGATGGGACAGTCTTCCTTCAGGTATTTTCAGTCTGTAGATTTTGCT 

GCTGTTAACAGAATCCACTCCTTCACACCAGCCTCAGGAATAGCCCACATACTTCTTATTGGCCAAG 

ATATGTCTGCTCTTTACTGCTGGAATTCGGAGCGTAATCAATTCTCTTTTGTTCTGGAAGTACCTTC 

TGCTTATGATGTGGCTTCTGTTACAGTAAAGTCCCTTAATTCAAGCAAGAATTTAATAGCTCTAGTG 

GGAGCTC ATTC AC ATAT AT ATG AGC TAGCCTAC ATTTCC AGCC ATTC TGAC TTT ATTCCTAGTTC AG 

GTG AACTG ATATTTG AAC CTGGTGAG AGAG AAGCTAC AATAGC AGTAAATATC C TTGATGATAC AGT 

TCC AG AAAAAGAAG AATCC TTCAAAG TTC AAC TT AAAAATCC C AAAGG AGGAGCAG AG ATTGGC AT T 

AATGATTC TGT AAC AAT AACC ATTCTGTC T AATGATG ATGCCT ATGG AATTGTTGCATTTGCTC AGA 

ATTCATTATATAAGCAAGTGGAAGAAATGGAGCAAGATAGCCTAGTAACCTTGAACGTTGAACGCTT 

AAAAGGAACATATGGCCGTATAACCATAGCATGGGAAGCTGATGGAAGTATTAGTGATATATTTCCT 

ACCTCAGGAGTGATTTTATTTACTGAAGGCCAGGTACTGTCAACAATCACTCTAACTATTCTTGCTG 

ATAATATACCAGAGTTATCAGAGGTTGTGATTGTAACCCTCACCCGTATCACCACAGAAGGGGTTGA 

GGACTCATACAAAGGTGCTACTATTGATCAGGACAGAAGCAAGTCTGTTATAACAACTTTGCCCAAT 

GACTCACCTTTTGGCTTGGTGGGCTGGCGTGCTGCGTCTGTCTTCATTAGAGTAGCAGAGCCTAAAG 

AAAAC ACC ACC ACTC TTC AGTTAC AAATAGCTCGAGATAAAGGACTACTTGGGGATATTGCC AT TC A 

CTTGAGAGCTC AACCCAATTTC TTACTGC ATGTCG ATAATC AAGC T AC TG AG AATG AAGATTATGTA 

TTGCAAGAAACAATAATAATAATGAAAGAAAACATAAAAGAAGCTCATGCCGAAGTTTCCATTTTGC 

CGGATGACCTTCCTGAATTGGAGGAAGGATTTATTGTCACTATCACTGAGGTGAACCTGGTGAACTC 

TGACTTCTCTACAGGACAGCCAAGTGTGCGGAGGCCCGGAATGGAAATAGCTGAGATAATGATAGAA 

GAAAATGACG ATCCCAGAGGAATTTTTATGTTTC ATGTTACTAGAGGCGC TGGGGAAGTTATTAC TG 

CC T ATG AGGTGCC TC C ACC C TTG AACGTTC TTC AAGT TC C TGT AG TCC GG C TGGCTGG AAG C TTTGG 

GGC AGTAAATGTTTATTGGAAAGC ATCACC AGAC AGTGC TGGCC TGGAAGACTTTAAACCATCTC AT 

GGGATTCTTGAATTTGCAGATAAACAGGTTACTGCAATGATAGAAATCACCATAATTGATGATGCTG 

AATTTGAATTGACAGAGACGTTCAATATTTCCTTGATCAGTGTTGCTGGAGGTGGCAGACTTGGTGA 

TGATGTTGTGGTAACTGTTGTTATTCCACAAAATGATTCTCCATTTGGAGTATTTGGATTTGAAGAA 

aars^PTYZTa^TTAAACATATCAGGGGAAAGCCTTGTTTCAGGCTAGCGTTTCATGTAATT 


AGAAAGTGTCTC AC ATTTTTGTTTTGGAAGTC TTGGCC AGGC ATGGTGGCTC ATGCC AGTAATCCC A 


GCACTTTGGGAGGCCGCAGCGGGCAGATCACGAGGTCAGGAGATTGACACCATCCTGGCCAATATGG 


TTGAATTCCCGTCTCTACTGAAAGTACAAAAATTAGCTGGGCGTGGTGGCACATGCCTGTATTCCCA 


GATACTTGGGAGGCTGAGGCAGGAGACTCGCTTGAACCCAGGAGGCAGAGGTTGCAGTGAGCTGAGA 


TCACGCCATTGCACTCCAGCCTGGCGACATAGAGAGACTCCATCTCAAAAAAAAAAAAAAAAAAAG 




ORF Start: ATG at 23 | ORF Stop: TAA at 11537





SEQ ID NO: 162 | 3838 aa | MW at 421384.3kD


NOV39b, 
CG150799-02 
Protein Sequence 


MVMVTFEVEGG PNP PDEDLS PVKGNI TF P PGRATVI YNLTVLDDEVP ENDE I FL I QLKSVEGGAE IN 
TSRNSIEIIIKKNDSPVRFLQSIYLVPEEDHILIIPVVRGKDNNGNLIGSDEYEVSISYAVTTGNST
AHAQQNLDFIDLQPNTTVVFPPFIHESHLKFQIVDDTTPEIAESFHIMLLKDTLQGDAVLISPSVVQ
VTIKPNDKPYGVLSFNSVLFERTVIIDEDRISRYEEITVVRNGGTHGNVSANWVLTRNSTDPSPVTA
DIRPS SGVLHFAQGQML ATI PLTVVDDDLPEEAEAYLLQ I L PHT IRGGAEVSE PAEDSDDVYGL I TF 
F PMENQK IES S PGERYLSL SFTRLGGTKGDVRLLYS VL Y I PAGAVDPLQ AKEG I LNI SRRNDL I FPE 
QKTQVTTKL P I RNDAFFQNGAHFLVQLETVELLNI I PL I P P I S PRFGE ICNI SLLVTPAIANGEI GF 
LSNLPIILHEPEDFAAEVVYIPLHRIX5TIX3QATVYWSLKPSGFNSKAVTPDDIGPFNGSVLFLSGQS 
DTT INI T IKGDDI PEMNETVTLSLDRVNVENQVLKSG YT SRDL I ILENDDPGGVFEF S PASRG P YVI 
KEGESVELHIIRSRGSLVKQFLHYRVEPRDSNEFYGNTGVLEFKPGEREIVITLLARLDGIPELDEH 
YWVVLSSHGERESKIXSSATIWITILKNDDPHGIIEFVSDGLIVMINESKGDAr/SAVYDVVRNRGN 
FGDVSVSWWSPDFTQDVFPVQGTVVFGDQEFSKNITIYSLPDEIPEEMEEFTVILLNGTGGAKVGN 
RTTATLR I RRNDDPI YFAEPRVVRVQEGETANFTVLRNGSVDVT I PVE 
KGETLI FEVG SRQQS I S I FVNEDGI PETDE PF YI I LLNS TGDTWYQYGVATVI I EANDDPNG I FSL 
EPIDKAVEEGKTNAFWILRHRGYFGSVSVSWQLFQNDSALQPGQEFYET^^ 

AF PDK I PEFNEFYFLKLVNI SGPGGQLAETNLQVTVMVPFNDDPFGVF ILDPECLEREVAEDVL SED 
DMSYITNFTILRQQGWGDVQLGWEILSSEFPAGLPPMIDFLLVGIFPTTVHLQQHMRRHHSGTDAL 
YFTGLEGAFGTVNPKYHPSRNNTI ANFTFSAWVMPNANTNGFI IAKDDGNGS I YYGVKIQTNESHVT 
LSLHYKTLGSNATYIAKTTVMKYLEESVWLPJLLIILEDGIIEFYLDGNAMPRGIKSLKGEAITDG^ 
ILRIGAG INGNDRFTGLMQDVRSYERKLTLEEI YELHAMPAKSDLHPISGYIiEFRQGETNKSFI I SA 
RDDNDEEGEELFILKLVSVYGGARI SEENTTARLTIQKSDNANGLFGFTGACI PEIAEEGSTISCW 
ERTRG ALD YVHVF YT I SQ I ETDG INYLVDDF ANAS GT I T FL P WQRS EVLN I YVLDDD I PELNEYFRV 
TLVSAIPGDGKLGSTPTSGASIDPEKETTDITIKASDHPYGLLQFSTGLPPQPKDAMTLPASSVPHI 
TVE EEDGE I RLLV I RAQGLLGR VT AE FRTVSLT AF S PEDYQNV AGTL EFQPG ER YKYI F I N I TDNS I 
PELEKSFKVELLNLEGGVAELFRVDGSGSASLGVASQILVTIAASDHAHGVFEFSPESLFVSGTEPE 
DGYSTVTLNVIRHHGTLSPVTLHWNIDSDPIX5DIiAFTSGNITFEIGQTSANITVEILPDEDPELDKA 
FSVSVLSVSSGSI^AHINATLTVLASDDPYGIFIFSEKNRPVKVEEATQNITLSIIRLKGLMGKVLV 
SYATLDDMEKPPYFPPNLARATOGRDYI PASGFALFGANOSEATI AISILDDDEPERSESVFIELLN 
STLVAKVQSRSIPNSPRLGPKVETIAQLIIIANDDAFGTLQLSAPIVRVAENHVGPIINVTRTGGAF
ADVSVKFKAVPITAIAGEDYSIASSDVVLLEGETSKAVPIYVINDiyPELEESFLVQLMNETTGGAR 
LGALTEAV III EASDDPYGLFGFQ ITKL IVEEPEFNSVKVNLP I 1 RNSGTLGNVTVQWVATINGQLA 
TGDLRWSGNVTFAPGET I QTLLLEVLADDVPEI EEVI QVQLTDASGGGT IGLDRI ANI 1 1 PANDDP 
YGWAFAQMVYRVQEPLERSSCANITVRRSGGHFGRLLLFYSTSDI DWALAMEEGQDLLSYYES PI 
QGVPDPLWRTWMNVS AVGEPL YTC ATLCLKEQAC S AFSFFS AS EG PQCFWMTSWI S PAVNNSDFWTY 
RKNMTRVASLFSGQAVAGSDYEPVTRQWAIMQEGDEFAI^TVSILPDDFPEMDESFIflSLLEVHLMN 
ISASLKNQPTIGQPNISTWTALNGDAFGOTVIYSISPNTSEDGLFVEVQEQPQTLVELMIHRTGGS 
LGQVAVEWRWGGTATEGLDFIGAGEI LTFAEGETKKTVILTILDDSEPEDDES 1 1 VSLVYTEGGSR 
ILPSSDTVRVNILANDNVAGIVSFQTASRSVIGHEGEILQFHVIRTFPGRGNVTVNWKIIGQNLELN 
F ANFSGQLF FPEG SLNTTLFVHLLDDN I PEEKEVYQVIL YDVRTQGVP P AG I ALLDAQGYAAVLTVE 
ASDEPHGVLNF ALS SRFVL LQEANIT I QLFINREFGSLGAI NVT YTTVPGML SLKNQTVGNL AEPEV 
DFVPIIGFLILEEGETAAAINITIIiEDDVPELEEYFLVNLTYVGLTMAASTSFPPRLDSEGLTAQVI 
IDANDGARGVI EWQQSRFEVNETHGSLTLVAQRSREPLGHVSLFVYAQNLEAQVGLDYI FTPMI LHF 
AIX3ERYKNVNIMILDDDIPEGDEKFQLILTNPSPGLEIX3KNT^ 

LREPTALYVQESVAVLY IVREPAQGLFGTVTVQF IVTEVNS SNESKDLT PSKG YI VLEEGVRFKALQ 
I SAILDTE PEMDEYFVCTLFNP/TGGARLGVHVQTL I TVLQNQAPLGLF S I S AVENRAT S IDI EEANR 
TVYIiNVSRTNGI DLAVSVQWETVS ETAFGMRGMDWFSVFQS FLDE S ASGWC FFTLENLI YGIMLRK 
S S VTVYRWQG IF I PVEDLNI ENPKTCEAFNI GFS P YFVI THEERNEEKPSLNSVFTFT SGFKLFLVQ 
T 1 1 ILE SSQVRYFTSDSQDYL 1 I ASQRDDS ELTQVFRWNGGSFVLHQKL P VRGVLTVALFNKGGS VF 
IJ^ISQANARLNSLLFRWSGSGFINFQEVPVSGTTEVEALSSANDIYLIFAKNVFLGDQNSIDIFIWE 
MGQSSFRYFQSVDFAAVNRIHSFTPASGIT^HILLIGQDMSALYCWNSERNQFSFVLEVPSAYDVASV 
TVKSLNSSKNLIALVGAHSHIYELAYISSHSDFIPSSGELIFEPGEREATIAVNILDDTVPEKEESF 
KVQLKNPKGGAE IGINDSVTITIL SNDDAYGI VAFAQNSLYKQVEEMEQDSLVTLNVERLKGTYGRI 
TIAWEADGSISDIFPTSGVILFTEGQVLSTITLTILADNIPELSEWIVTLTRITTEGVEDSYKGAT 
IDQDRSKS VI TTLPNDS PFGLVGWRAASVF I RVAEPKENTTTLQLQI ARDKGLLGD I A IHLRAQPNF 
LLHVDNQATENEDYVLQETI I IMKENIKEAHAEVSILPDDLPELEEGFIVTITEVNLVNSDFSTGQP 
SVRRPGMEIAEIMIEENDDPRGIFMFHVTRGAGEVITAYE^^ 

AS PDSAGLEDFKPSHG ILEFADKQVTAMIEITI IDDAEFELTETFNI SL I SVAGGGRLGDDWVTW 
I PQNDS PFGVFGFEEKTVS 





SEQ ID NO: 163 | 5102 bp




NOV39c, 
CG150799-03 
DNA Sequence 


CAGGGAAAAGGGAACCTATGGAATGGTCATGGTGACTTTTGAGGTAGAGGGTGGCCCAAATCCCCCT 
GATGAAGATTTGAGTCCAGTTAAAGGAAATATCACCTTTCCCCCTGGCAGAGCAACAGTAATTTATA 
ACTTGACAGTAC TCGATGACGAGGTACC AGAAAATGATGAAATATTTTT AATT CAACTGAAAAGTGT 
AGAAGGAGGAGCTGAGATTAACAC CTC TAGG AAT TCC ATTG AGATC ATC ATTAAGAAAAATG ATAGT 
CCCGTGAGATTCCTTCAGAGTATTTATTTGGTTCCTGAGGAAGACCACATACTCATAATTCCAGTAG 
TTC GTGG AAAGG AC AAC AATGGAAATC TGAT TG G ATC TG ATG AAT A TG AGGT T TC AATC AGTTATGC 
TGTCAC AACTGGGAATTCC AC AGCAC ATGCC C AGC AAAATCTGGAC TTC ATTGATCTTCAGCCAAAC 
ACAACTGTTGTTTTTCCACCTTTTATTCATGAATCTCACTTGAAATTTCAAATAGTTGATGACACCA 
CACCGGAGATTGCTGAATCGTTTCACATTATGTTACTAAAAGATACCTTACAGGGAGATGCTGTGCT 
AATAAGCC CTTC TGTTGTACAAGTCACC ATTAAGC C AAATG ATAAACCTTATGG AGTC CTTTC ATTC 
AACAGTGTTTTGTTTGAAAGGACAGTTATAATTGATOAAGATAGAATATCAAGATATGAAGAAATCA 
GAGTGGTTAGAAATGGAGGAACCCATGGGAATGTCTCTGCGAATTGGGTGTTGACACGGAACAGCAC 
TGATCCCTCACCAGTAACAGCAGATATCAGACCGAGCTCTGGAGTTCTCCATTTTGCACAAGGGCAG 
ATGTTGGC AAC AATTC C TCTTAC TGTGGTTG ATGATG ATCTTC C AG AAG AGGC AGAAGC TTATC TAC 
TTCAAATTCTGCCTCATACAATACGAGGAGGTGCAGAAGTGAGCGAGCCAGCGGAGGATAGTGATGA 
TGTCTATGGCCTAATAACATTTTTTCCTATGGAAAACCAGAAGATTGAAAGCAGCCCAGGTGAACGA 
TACTTATCCTTGAGTTTTAC AAGACTAGGAGGGAC TAAAGGAGATGTGAGGTTGCTTTATTC TGTAC 
TTTACATTCCTGCTGGAGCTGTGGACCCCTTGCAAGCAAAAGAAGGCATCTTAAATATATCAAGGAG 
AAATGACCTCATTTTTCCAGAGCAAAAAACTCAAGTCACTACAAAATTACCAATAAGAAATGATGCA 
TTCTTTCAAAATGGAGCTCACTTTCTAGTACAGTTGGAAACTGTGGAGTTGTTAAACATAATTCCTC 
TAATCCCACCC ATAAGCCC TAGATTTGGGGAAATC TGC AATATTTC TTT ACTGGTTACTC C AGCCAT 
TGCAAATGGAGAAATTGGCTTTCTCAGCAATCTTCCAATTATTTTGCATGAACCAGAAGATTTTGCT 
GCTGAAGTGGTATACATTCCCTTACATCGGGATGGAACTGATGGCCAGGCTACTGTCTACTGGAGTT 
TGAAGCCCTCTGGCTTTAATTCAAAAGCAGTGACCCCGGATGATATAGGCCCCTTTAATGGCTCTGT
TTTGTTTTTATCTGGGCAAAGTGACACAACAATCAACATTACTATCAAAGGTGATGACATACCGGAA 
ATGAATGAAACTGTAACACTTTCTCTAGACAGGGTTAACGTGGAAAACCAAGTGCTGAAATCTGGAT 
ATACTAGCCGTGACCTAATTATTTTGGAAAATGATGACCCTGGGGGAGTTTTTGAATTTTCTCCTGC
TTCCAGAGGACCCTATGTTATAAAAGAAGGAGAATCTGTAGAGCTCCACATCATCCGATCAAGGGGG 
TCCCTTGTTAAGCAGTTTCTACACTACCGAGTAGAGCCAAGAGATAGCAATGAATTCTATGGAAACA 
CGGGAGTACTAGAATTTAAACCTGGAGAAAGGGAGATAGTGATCACCTTGCTAGCAAGATTGGATGG 
GATACCAGAGTTGGATGAACACTACTGGGTGGTCCTCAGCAGCCACGGAGAACGGGAAAGCAAGTTG
GGAAGTGCCACCATTGTCAATATAACGATTCTGAAAAATGATGATCCTCATGGCATTATAGAATTTG
TTTCTGATGGTCTAATTGTGATGATAAATGAAAGCAAAGGAGATGCTATCTATAGTGCTGTTTATGA 
TGTAGTAAGAAATCGAGGCAACTTTGGTGATGTTAGTGTATCATGGGTGGTTAGTCCAGACTTTACA 
CAAGATGTATTTCCTGTACAAGGGACTGTTGTCTTTGGAGATCAGGAATTTTCAAAAAATATCACCA 
TTTACTCCCTTCCAGATGAGATTCCAGAAGAAATGGAAGAATTTACCGTTATCCTACTGAATGGCAC 
TG G AGGAG C T AAAGTGGG AAATAGAAC AAC TG CAACTC TGAGGATTAG AAG AAATG ATG AC C C C ATT 
TATTTTGCAGAACCTCGTGTAGTGAGGGTTCAGGAA(!ld •

ATGGATCTGTTGATGTGACTTGCATGGTCCAGTATGCTACCAAGGATGGGAAGGCTACTGCAAGAGA 

GAGAG ATTTC ATTC C TGTTG AAAAAGG AGAAAC GC TC ATTT TTG AGGTTGGAAGTAGAC AGC AGAGC 

ATATCC ATATTTGTTAATGAAGATGGTATCC CGG AAAC AGATGAGCCC T TTTATATAATCC TC TTGA 

ATTC AACAGGTGATAC AGTAGTATATCAATATGGAGTAGC TAC AGT AATAATTGAAGC TAATGATGA 

CCCAAATGGCATTTTTTCTCTGGAGCCCATAGACAAAGCAGTGGAAGAAGGAAAGACTAATGCATTT 

TGGATTTTGAGGCACCGAGGATACTTTGGTAGTGTTTCTGTATCTTGGCAGCTCTTTCAGAATGATT 

C TGC TTTGC AGCCTGGGC AGG AGTTCT ATGAAACTTC AGGAACTGTTAAC TTC ATGG ATGG AGAAG A 

AGCAAAACCAATCATTCTCCATGCTTTTCCAGATAAAATTCCTGAATTCAATGAATTTTATTTCCTA 

AAACTTGTAAACATTTCAGGTCCTGGGGGCCAGCTAGCAGAAACCAACCTCCAGGTGACAGTAATGG 

TTCCATTCAATGATGATCCCTTTGGAGTTTTTATCTTGGATCCAGAGTGTTTAGAGAGAGAAGTGGC 

AGAAGATGTCCTGTCTGAAGATGATATGTCTTATATTACCAACTTCACCATTTTGAGGCAGCAGGGT 

GTGTTTGGTGATGTACAACTGGGCTGGGAAATACTGTCCAGTGAGTTCCCTGCTGGTTTGCCACCAA

TGATAGATTTTTTACTGGTTGGAATTTTCCCCACCACCGTGCATTTACAACAGCACATGCGGCGTCA 

CCACAGTGGAACGGATGCTTTGTACTTTACCGGACTAGAGGGTGCATTTGGGACTGTTAATCCAAAA 

TACCATCCCTCCAGGAATAATACAATTGCCAACTTTACATTCTCAGCTTGGGTAATGCCCAATGCCA 

ATACGAATGGATTCATTATAGCGAAGGATGACGGTAATGGAAGCATCTACTACGGGGTAAAAATACA 

AAC AAACGAATCCC ATGTGAC ACTT TC CCTTCATTATAAAACCTTGGGTTCC AATGCTACATAC ATT 

GCCAAGACAACAGTCATGAAATATTTAGAAGAAAGTGTTTGGCTTCATCTACTAATTATCCTGGAGG 

ATGGTATAATCGAATTCTACCTGGATGGAAATGCAATGCCCAGGGGAATCAAGAGTCTGAAAGGAGA 

AGCCATTACTGACGGTCCTGGGATACTGAGAATTGGAGCAGGGATAAATGGCAATGACAGATTTACA 

GGTCTGATGCAGGATGTGAGGTCCTATGAGCGGAAACTGACGCTTGAAGAAATTTATGAACTTCATG 

CCATGCCCGCAAAAAGTGATTTACACCCAATTTCTGGATATCTGGAGTTCAGACAGGGAGAAACTAA 

CAAATCATTCATTATTTCTGCAAGAGATGACAATGACGAGGAAGGAGAAGAATTATTCATTCTTAAA 

CTAGTTTCTGTATATGGAGGAGCTCGTATTTCGGAAGAAAATACTACTGCAAGATTAACAATACAAA

AAAGTGACAATGCAAATGGCTTGTTTGGTTTCACAGGAGCTTGTATACCAGAGATTGCAGAGGAGGG 

ATCAACCATTTCTTGTGTGGTTGAGAGAACCAGAGGAGCTCTGGATTATGTGCATGTTTTTTACACC 

ATTTCACAGATTGAAACTGATGGCATTAATTACCTTGTTGATGACTTTGCTAATGCCAGTGGAACTA

n , T»iir , Aa , TnPTT^f , ^TGGr'aGanATPAGAGrTTTTGATTGAAGTGTCGCTTCCCATTATTATTTACAA 

PTY^n^aar'TflATAraTTAGAATTTGCTTCAAACATGTCTGCTGTAAAACCTTTATCAGGTTCTGAATA 

T'aa»aT , nTTPTTGaTGATGATATTCCTGAACTTAATGAGTATTTCCGTGTGACATTGGTTTCTGCAAT 




iprr^nrr' ar2 & »rrwrt a anp rp arifipTr 1 A APTPPT Aft" 1 AGTGGTGC AAGC ATAG ATCCTGAAAAGGAAACG 




AC TG ATATC ACC ATC AAAGC TAGTGATC ATCC ATATGGCTTGC TGC AGTTCTCC AC AGGGCTGCCTC 




CTCAGCCTAAGGACGCAATGACCCTGCCTGCAAGCAGCGTTCCACATATCACTGTGGAGGAGGAAGA 




TGGAG AAATC AGGTT ATTGGTC ATC C G TGC AC AGGG AC TT C TGGG AAGGG TG ACT GCGG AATTT AG A 




ACAGTGTCCTTGACAGCATTCAGTCCTGAGGATTACCAGAATGTTGCTGGCACATTAGAATTTCAAC 




CAGGAGAAAGATATAAATACATTTTCATAAACATCACTGATAATTCTATTCCTGAACTGGAAAAATC 




TTTT AAAGTTGAGTTGTT AAACTTGG AAGGAGGAGCTCTGC T AGATC TATCTAC AGAT ATAACGC TG 




TAAAATCTGGTCCTTTTGGATGATCTATAATGAGTTGATTATTAATAAAAGAAGTCAAC7VATACCTT 




AAAAAAAAAA 




ORF Start: ATG at 23 | ORF Stop: TGA at 4430





SEQ ID NO: 164 | 1469 aa | MW at 162809.6kD


NOV39c, 
CG150799-03 
Protein Sequence 


MVMVTFEVEGGPNPPDEDLSPVKGNITFPPGRATVIY^ 

TSRNSIEI I IKKNDS PVRFLQS I YLVPEEDHILI I PWRGKDNNGNL IGSDEYEVS I SYAVTTGNS T 
AHAQQNLDF IDLQPNTTVVFPPFI HESHLKFQ IVDDTTPEI AESFH IMLLKDTLQGDAVL IS PS WQ 
VTIKPNDKP YGVL SFNSVLFERTV I IDEDR I SRYEE I TVVRNGGTHGNVS ANWVLTRNSTDPSPVTA 
DIRPSSGVLHFAQGQMLATIPLTVVDDDLPFJ2AEAYL^ 

F PMENQKI ESS PGERYLSLSFTRLGGTKGDVRLLYSVLYI PAGAVDPLQAKEGI LNI SRRNDLI FPE 
QKTQ VTTKL P IRNDAFFQNGAHFLVQLETVELLNI I PLI PP I S PRFGE I CN I SLLVT P AI ANGE IGF 
LSNLP I ILHEPEDFAAEWYI PLHRDGTDGQATVYWSLKPSGFNSKAVTPDDIGPFNGSVLFLSGQS 
iyTTINITIKGDDIPEMNEWTLSLDRVNVENQVLKSGYTSRDLIILENDDPGGVFEFSPASRGPYVI 
KEGESVELHIIRSRGSLVKQFLHYRVEPRDSNEFYGNTGVLEFKPGEREIVITLLARLDGIPELDEH 
YWA^SSHGFJlESKI^SATIWITILKira^PHGI^^ 

FGDVSVSWVVSPDFTQDVFPVQGTVVFGDQEFSKN IT I YSL PDEI PEEMEEFTVI LLNGTGGAKVGN 
RTTATLRI RRNDDP I YFAEPR WRVQEGETANFTVLRNG S VDVTCMV Q Y ATKDGKATARERDF I PVE 
KGETLIFEVGSRQQSI SI FVNEDGIPETDEPFYI ILLNSTGDTVVYQYGVATVI I EANDDPNGIFSL 
EPI DKAVEEGKTNA^WI LRHRG YFGSVSVSWQLFQNDS AXQPGQEFYETSGTVNFMIXSEFJ^P I ILH 
AFPDKI PEFNEF YFLKLVNI SG PGGQLAETNLQVTVMVP FNDDPFGVFI LDPECLEREVAEDVLSED 
DMS YI TNFT I LRQQGVFGDVQLGWEILSSEFPAGLPPMIDF LLVG IF PTTVHLQQHMRRHHSGTDAL 
YFTGLEGAFGTWPKYHPSRI^IANFTFSAWVMPNANTNGFIIAKDDGNGSIYYGVKIQTNESH^ 
LSLHYKTLGSNATYIAJCTTVMKYLEESVWLHLLIILFJ3GIIEFYLDGNAMPRGIKSLKGEAITDGPG 
ILRI GAGI NGNDRFTGLMQDWSYERKLTLEEI YELHAMPAKSDLHP I SGYLEFRQGETNKS F I ISA 
RDDNDEEGEELF ILKLVS VYGG AR I SEENTTARLT IQKSDNANGL FGFTG AC I PEI AEEGS T ISCW 
ERTRGALDYVHVFYT I SQIETDGI NYLVDDFANASGT I TFLPWQRSELL I EVSLPI 1 1 YNCN 
SEQ ID NO: 165 | 8350 bp

NOV39d, CG150799-01 DNA Sequence



CAGGGAAAAGGGAACCTATGGAA TQGTCATGGTGACTTTTGAGGTAGAGGGTGGCCCAAATCCCCCT 



GATGAAGATTTGAGTCCAGTTAAAGGAAATATCACCTTTCCCCCTGGCAGAGCAACAGTAATTTATA 
ACTTGACAGTACTCGATGACGAGGTACCAGAAAATGATGAAATATTTTTAATTCAACTGAAAAGTGT 
AGAAGGAGGAGCTGAGATTAACACCTCTAGGAATTCCATTGAGATCATCATTAAGAAAAATGATAGT 
CCCGTGAGATTCCTTCAGAGTATTTATTTGGTTCCTGAGGAAGACCACATACTCATAATTCCAGTAG 
TTCGTGGAAAGGACAACAATGGAAATCTGATTGGATCTGATGAATATGAGGTTTCAATCAGTTATGC 
TGTC AC AAC TGGGAATTC C AC AGC AC ATGC C C AGCAAAATCTGG AC TTC ATTG ATCTTC AGCCAAAC 
ACAACTGTTGTTTTTCCACCTTTTATTCATGAATCTCACTTGAAATTTCAAATAGTTGATGACACCA 
CACCGGAGATTGCTGAATCGTTTCACATTATGTTACTAAAAGATACCTTACAGGGAGATGCTGTGCT 
AATAAGCCCTTCTGTTGTACAAGTCACCATTAAGCCAAATGATAAACCTTATGGAGTCCTTTCATTC 
AACAGTGTTTTGTTTGAAAGGACAGTTATAATTGATGAAGATAGAATATCAAGATATGAAGAAATCA 
CAGTGGTTAGAAATGGAGGAACCCATGGGAATGTCTCTGCGAATTGGGTGTTGACACGGAACAGCAC 
TGATCCCTCACCAGTAACAGCAGATATCAGACCGAGCTCTGGAGTTCTCCATTTTGCACAAGGGCAG 
ATGTTGGCAAC AATTCCTC TT AC TGTGGTTGATGATGATCTTC C AGAAGAGGC AGAAGCTTATCTAC 
TTCAAATTCTGCCTCATACAATACGAGGAGGTGCAGAAGTGAGCGAGCCAGCGGAGGATAGTGATGA 
TGTCTATGGCCTAATAACATTTTTTCCTATGGAAAACCAGAAGATTGAAAGCAGCCCAGGTGAACGA 
TACTTATCCTTGAGTTTTACAAGACTAGGAGGGACTAAAGGAGATGTGAGGTTGCTTTATTCTGTAC 
TTTACATTC CTGC TGGAGCTGTGG AC CCCTTGC AAGCAAAAGAAGGC ATCTTAAATATATCAAGGAG 
AAATGACC TC ATTTTTCC AGAGCAAAAAAC TCAAGTC ACTAC AAAATT ACC AATAAG AAATGATGC A 
TTCTTTCAAAATGGAGCTCACTTTCTAGTACAGTTGGAAACTGTGGAGTTGTTAAACATAATTCCTC 
TAATCCCACCCATAAGCCCTAGATTTGGGGAAATCTGCAATATTTCTTTACTGGTTACTCCAGCCAT 
TGCAAATGGAGAAATTGGCTTTCTCAGCAATCTTCCAATTATTTTGCATGAACCAGAAGATTTTGCT 
GCTGAAGTGGTATACATTCCCTTACATCGGGATGGAACTGATGGCCAGGCTACTGTCTACTGGAGTT 
TGAAGCCC TCTGGC TTT AATTC AAAAGC AGTGACCCCGGATGATATAGGCCCCTTTAATGGCTCTGT 
TTTGTTTTTATCTGGGCAAAGTGACACAACAATCAACATTACTATCAAAGGTGATGACATACCGGAA 
ATGAATGAAACTGTAAC ACTTTC TC TAGAC AGGGTTAACGTGGAAAACCAAGTGCTGAAATCTGGAT 
ATACTAGCCGTGACCTAATTATTTTGGAAAATGATGACCCTGGGGGAGTTTTTGAATTTTCTCCTGC 
TTCCAGAGGACCCTATGTTATAAAAGAAGGAGAATCTGTAGAGCTCCACATCATCCGATCAAGGGGG 
TCCCTTGTTAAGC AGTTTC TAC AC T AC CGAGTAGAGCC AAGAGATAGC AATGAATTCTATGGAAAC A 
CGGGAGTACTAGAATTTAAACCTGGAGAAAGGGAGATAGTGATCACCTTGCTAGCAAGATTGGATGG 
GATACCAGAGTTGGATGAACACTACTGGGTGGTCCTCAGCAGCCACGGAGAACGGGAAAGCAAGTTG 
GGAAGTGCCACCATTCTCAATATAACGATTCTGAAAAATGATGATCCTCATGGC^TTATAGAATTTG 
TTTC TGATGGTCTAATTGTGATG ATAAATGAAAGC AAAGGAGATGCTATCTATAGTGC TGTTTATGA 
TGTAGTAAGAAATCG AGGC AAC TTTGGTGATGTTAGTGTATCATGGGTGGTT AGTCC AGAC TTTACA 
CAAGATGTATTTCC TGTACAAGGGAC TGTTGTCTTTGGAGATC AGGAATTTTC AAAAAATATCACCA 
TTTACTCCCTTCCAGATGAGATTCCAGAAGAAATGGAAGAATTTACCGTTATCCTACTGAATGGCAC 
TGGAGGAGCTAAAGTGGGAAAT AGAAC AACTGCAAC TC TGAGGATTAGAAGAAATGATGACCCCATT 
TATTTTGCAGAACCTCGTGTAGTGAGGGTTCAGGAAGGTGAGACTGCCAACTTTACAGTTCTCAGAA 
ATGGATCTGTTGATGTGACTTGCATGGTCCAGTATGCTACCAAGGATGGGAAGGCTACTGCAAGAGA 
GAGAGATTTCATTCCTGTTGAAAAAGGAGAAACGCTCATTTTTGAGGTTGGAAGTAGACAGCAGAGC 
ATATCC ATATTTGTT AATGAAGATGGTATC CCGGAAAC AGATGAGCCC TTTTATATAATCCTCTTGA 
ATTCAACAGGTGATACAGTAGTATATCAATATGGAGTAGCTACAGTAATAATTGAAGCTAATGATGA 



TGGATTTTGAGGCACCGAGGATACTTTGGTAGTGTTTCTGTATCTTGGCAGCTCTTTCAGAATGATT 
CTGCTTTGCAGCCTGGGCAGGAGTTCTATGAAACTTCAGGAACTGTTAACTTCATGGATGGAGAAGA 
AGCAAAACCAATCATTCTCCATGCTTTTCCAGATAAAATTCCTGAATTCAATGAATTTTATTTCCTA 
AAAC TTGTAAAC ATTTC AGGTCC TGGGGGC C AGCTAGC AGAAACC AACCTC C AGGTG ACAGTAATGG 
TTCCATTCAATGATGATCCCTTTGGAGTTTTTATCTTGGATCCAGAGTGTTTAGAGAGAGAAGTGGC 
AGAAGATGTCCTGTC TGAAGATG ATATGTCTTATATTACCAAC TTC ACCATTTTGAGGCAGCAGGGT 
GTGTTTGGTGATG TAC AACTGGGCTGGGAAATACTGTCCAGTGAGTTCCCTGC TGGTT TGCCACC AA 
TGATAGATTTTTTACTGGTTGGAATTTTCCCCACCACCGTGCATTTACAACAGCACATGCGGCGTCA 
CCACAGTGGAACGGATGCTTTGTACTTTACCGGACTAGAGGGTGCATTTGGGACTGTTAATCCAAAA 
TACCATCCCTCCAGGAATAATACAATTGCCAACTTTAC^TTCTCAGCTTGGGTAATGCCCAATGCCA 
ATACGAATGGATTCATTATAGCGAAGGATGACGGTAATGGAAGCATCTACTACGGGGTAAAAATACA 
AACAAACGAATCCCATGTGACACTTTCCCTTCATTATAAAACCTTGGGTTCCAATGCTACATACATT 
GCCAAGACAACAGTCATGAAATATTTAGAAGAAAGTGTTTGGCTTCATCTACTAATTATCCTGGAGG 
ATGGTATAATCGAATTCTACCTGGATGGAAATGCAATGCCCAGGGGAATCAAGAGTCTGAAAGGAGA 
AGCCATTACTGACGGTCCTGGGATACTGAGAATTGGAGCAGGGATAAATGGCAATGACAGATTTACA 
GGTCTGATGCAGGATGTGAGGTCCTATGAGCGGAAACTGACGCTTGAAGAAATTTATGAACTTCATG 
CCATGCCCGCAAAAAGTGATTTACACCCAATTTCTGGATATCTGGAGTTCAGACAGGGAGAAACTAA 
CAAATCATOCATTATTTCTGCAAGAGATGACAATGACGAGGAAGGAGAAGAATTATTCATTCTTAAA 
CTAGTTTCTG TATATGGAGGAGC TCGTATTTCGGAAGAAAATACTACTGC AAGATTAAC AATACAAA 
AAAGTGACAATGCAAATGGCTTGTTTGGTTTCACAGGAGCTTGTATACCAGAGATTGCAGAGGAGGG 
ATCAACCATTTCTTGTGTGGTTGAGAGAACCAGAGGAGCTCTGGATTATGTGCATGTTTTTTACACC 
ATTTCAC AGATTGAAACTGATGGCATTAATTACCTTGTTGATGAC TTTGCTAATGC CAGTGGAACTA 
TTACATTCCTTCCTTGGCAGAGATCAGAGGTTCTGAATATATATGTTCTTGATGATGATATTCCTGA 
ACTTAATX3AGTAT0TCCGTGTGACATTGGTTTCTGCAATTCCTGGAGATGGGAAGCTAGGCTCAACT 
CCTACCAGTGGTGCAAGCATAGATCCTGAAAAGGAAACGACTGATATCACCATCAAAGCTAGTGATC 
ATCCATATGGCTTGCTGCAGTTCTCCACAGGGCTGCCTCCTCAGCCTAAGGACGCAATGACCCTGCC 
TGCAAGCAGCGTTCCACATATCACTGTGGAGGAGGAAGATGGAGAAATCAGGTTATTGGTCATCCGT 
GC ACAGGGACTTC TGGGAAGGGTGACTGCGGAATTTAGAACAGTGTCC TTGAC AGC ATTCAGTCCTG 
AGGATTACCAGAATGTTGCTGGCACATTAGAATTTCAACCAGGAGAAAGATATAAATACATTTTCAT 
AAACATCACTGATAATTCTATTCCTGAACTC^

GGAGGAGTAGCTGAACTCTTTAGGGTTGATGGAAGTGGTAGTGCCAGTCTAGGAGTGGCTTCCCAAA 
TTCTAGTGACAATTGCAGCCTCTGACCACGC TCATGGCGTATTTG AATTTAGCCCTGAGTC ACTC TT 
TGTCAGTGGAACTGAACCAGAAGATGGGTATAGCACTGTTACATTAAATGTTATAAGACATCATGGA 
ACTCTGTCTCCAGTGACTTTGCATTGGAACATAGACTCTGATCCTGATGGTGATCTCGCCTTCACCT 
CTGGCAACATCACATTTGAGATTGGGCAGACGAGCGCCAATATCACTGTGGAGATATTGCCTGACGA 
AGACCCAGAACTGGATAAGGCATTCTCTGTGTCAGTCCTCAGTGTTTCCAGTGGTTCTTTGGGAGCT 
CATATTAATGCCACGTTAACAGTTTTGGCTAGTGATGATCCATATGGGATATTCATTTTTTCTGAGA 
AAAACAGACCTGTTAAAGTTGAGGAAGCAACCCAGAACATCACACTATCAATAATAAGGTTGAAAGG 
CCTC ATGGGAAAAGTC C TTGTC TC ATATGC AACAC TAGATGATATGG AAAAACCACCTTATTTTCC A 
CCTAATTTAGCGAGAGCAACTCAAGGAAGAGACTATATACCAGCTTCTGGATTTGCTCTTTTTGGAG 
CTAATCAGAGTGAGGCAACAATAGCTATTTCAATTTTGGATGATGATGAGCCAGAAAGGTCCGAATC 
TGTCTTTATCGAACTACTCAACTCTACTTTAGTAGCGAAAGTACAGAGTCGTTCAATTCCAAATTCT 
CC ACGTCTTGGGC C TAAGGTAG AAACTATTGCGCAAC TAATTATC ATTGCC AATGATGATGC ATTTG 
GAACTCTTCAGCTCTCAGCACCAATTGTCCGAGTGGCAGAAAATCATGTTGGACCCATTATCAATGT 
GACTAGAACAGGAGGAGCATTTGCAGATGTCTCTGTGAAGTTTAAAGCTGTGCCAATAACTGCAATA 
GCTGGTGAAG ATT ATAGTATAGC TTC ATC AGATGTGGTCTTGC TAGAAGGGG AAACC AGTAAAGC CG 
TGCC AATATATGTC ATTAATG ATATC TATCC TG AACTGGAAG AATC TTTTCTTGTGCAACTGATGAA 
TGAAACAACAGGAGGAGCCAGACTAGGGGCTTTAACAGAGGCAGTCATTATTATTGAGGCCTCTGAT 
GACCCCTATGGATTATTTGGTTTTCAGATTACTAAACTTATTGTAGAGGAACCTGAGTTTAACTCAG 
TGAAGGTAAACCTGCCAATAATTCGAAATTCTGGGACACTCGGCAATGTTACTGTTCAGTGGGTTGC 
CACCATTAATGGACAGCTTGCTACTGGCGACCTGCGAGTTGTCTCAGGTAATGTGACCTTTGCCCCT 
GGGGAAAC C ATTC AAAC C TTGTTGTTAGAGGTCCTGGC TGACGACGTTCCGG AGATTGAAGAGGTTA 
TCCAAGTGCAACTAACTGATGCCTCTGGTGGAGGTACTATTGGGTTAGATCGAATTGCAAATATTAT 
TATTCCTGCCAATGATGATCCTTATGGTACAGTAGCCTTTGCTCAGATGGTTTATCGTGTTCAAGAG 
CCTCTGGAAAGAAGTTCCTGTGCTAATATAACTGTCAGGCGAAGCGGAGGGCACTTTGGTCGGCTGT 
TGTTGTTCTACAGTACTTCCGACATTGATGTAGTGGCTCTGGCAATGGAGGAAGGTCAAGATTTACT 
GTCC TACTATGAATC TCC AATTCAAGGGGTGCC TGACCCACTTTGGAGAACTTGGATGAATGTCTCT 
GCCGTGGGGGAGCCCCTGTATACCTGTGCCACTTTGTGCCTTAAGGAACAAGCTTGCTCAGCGTTTT 
CATTTTTCAGTGCTTCTGAGGGTCCCCAGTGTTTCTGGATGACATCATGGATCAGCCCAGCTGTCAA 
CAATTCAGACTTCTGGACCTACAGGAAAAACATGACCAGGGTAGCATCTCTTTTTAGTGGTCAGGCT 
GTGGC TGGGAGTGAC T ATGAGCCTGTGACAAGGCAATGGGCC ATAATGCAGGAAGGTGATGAATTCG 
CAAATCTCACAGTGTCTATTCTTCCTGATGATTTCCCAGAGATGGATGAGAGTTTTCTAATTTCTCT 
CCTTGAAGTTCACCTCATGAACATTTCAGCCAGTTTGAAAAATCAGCCAACCATAGGACAGCCAAAT 
AT T TC TAC AGTTGTC ATAGC ACTAAATGGTGATGCCTTTGGAGTGTTTGTGATCTAC AATATTAGTC 
CCAATACTTCCGAAGATGGCTTATTTGTTGAAGTTCAGGAGCAGCCCCAAACCTTGGTGGAGCTGAT 
GATACACAGGACAGGGGGCAGCTTAGGTCAAGTGGCAGTCGAATGGCGTGTTGTTGGTGGAACAGCT 
AC TG AAGGTTTAG AT T TT AT AGGTGC TGGAG AG AT TC TG ACC TTTG C TGAAGGTG AAAC C AAAAAGA 
CAGTCATTTTAACCATCTTGGATGACTCTGAACCAGAGGATGACGAAAGTATCATAGTTAGTTTGGT 
GTACACTGAAGGT<3GAAGTAGAATTTTGCC^GCTCCGAC^CT<3TTAGAGTGAACATTTTGGCCAAT 
GACAATGTGGCAGGAATTGTTAGCTTTCAGACAGCTTCCAGATCTGTCATAGGTCATGAAGGAGAAA 

TGGGCAAAATCTAGAACTCAATTTTGCTAACTTTAGCGGACAACTTTTCTTTCCTGAGGGGTCGTTG 
AATACAACATTGTTTGTGC ATTTG TTGGATGACAAC AT TC C TG AGGAG AAAGAAGTATAC C AAGTC A 
TTCTGTATGATGTCAGGACACAAGGAGTTCCACCAGCCGGAATCGCCCTGCTTGATGCTCAAGGATA 
TGCAGCTGTCCTCACAGTAGAAGCCAGTGATGAACCACATGGAGTTTTAAATTTTGCTCTTTCATCA 
AGATTTGTGTTACTACAAGAGGCTAACATAACAATTCAGCTTTTCATCAACAGAGAATTTGGATCTC 
TAGGAGCTATCAATGTCACATATACCACGGTTCCTGGAATGCT<*AGTCTGAAGAACCAAACAGTAGG 
AAACCTAGCAGAGCCAGAAGTTGATTTTGTCCCTATCATTGGCTTTCTGATTTTAGAAGAAGGGGAA 
ACAGCAGCAGCCATCAACATTACCATTCTTGAGGATGATGTACCAGAGCTAGAAGAATATTTCCTGG 
TGAATTTAACTTACGTTGGACTTACCATGGCTGCTTCAACTTCATTTCCTCCCAGACTAGGTATGAG 
(^GTTTCTTGTTTGTTTCTTTTTGCTCACTTCAAATGAAATGAAGAAACTTCATTTTTGAATCAGA^ 


GTGATCATTGTGCTGTTTTGTTAATCTTAGCTATGTGTTAAA

ORF Start: ATG at 23 | ORF Stop: TGA at 8282





SEQ ID NO: 166 | 2753 aa | MW at 301743.8kD

NOV39d, CG150799-01 Protein Sequence


MVMVTFEVEGGPNPPDEDL S PVKGN I TF PPGRATVI YNLTVLDDEVPENDE I FL I QLKS VEGG AE IN 
TSRNSIEIIIKKNDSPVRFLQSIYLVPEEDHILIIPWRGKDNNGNLIGSDEyEVSISYAVTTGNST 
AHAQQNLDFIDLQPNTTWFPPFIHESHLKFQIVDDTTPEIAESFHIMLLKDTLQGDAVLISPSVVQ 
VTIKPNDKPYGVLSFNSVLFERTVI IDEDRI SR YE E I TWRNGGTHGNV S ANWVLT RNSTDP S P VT A 
DIRPSSGVLHFAQGQMLAT I PLTWDDDLPEEAEAYLLQIL PHT IRGGAEVSEPAEDSDDVYGL I TF 
F PMENQKI ES S PGERYIiSL S FTRLGGTKGDVRLIj YSVLYI P AGAVDPLQAKEGILNI SKRNDLIFPE 
QKTQVTTKLPIFNDAFFQNGAHFLVQLETVELLNI I PLI PPI S PRFGE ICNI SLL VT PAIANGEIGF 
LSNLPI ILHEPEDFAAEWYI PLHRIX3TDGQATVYWSLKPSGFNSKAVTPDDIGPFNGSVLFLSGQS 
DTTINITIKGDDIPEMNETVTLSLDRVNVENQVLKSGYTSRDLIILENDDPGGVFEFSPASRGPYVI 
KEGESVELHI IRSRGSLVKQFLHYRVE PRDSNEF YGNTGVLEFK PGERE IVI TLLARLDG I PELDEH 
YWVVLSSHGERESKLGSATIVNITILKNDDPHGIIEFVSDGLIVMINESKGDAIYSAVYDVVRNRGN 
FGDVSVSWWS PDFTODVF PVOGTWFGDOEFSKNITI YSLPDEI PEEMEEFTVILLNGTGGAKVGN 
KGETL I FEVGSRQQS I SI FVNEDG I PETDEPFYI I LLNSTGDTWYQ YGVATVI I EANDDPNGIFSL 
EPIDKAVEEGKTNAFWILRHRGYFGSVSVSWQLFQNDSALQPGQEFYETSGTVNFMDGEEAKPIIL^ 
AFPDKIPEFNEFYFLKLVNISGPGGQLAETNLQVTVMVPFNDDPFGVFILDPECLEREVAEDVLSED 
DMS Y I TNFT ILRQQGVFGDVQLGWE I L SS EFPAGLPPMI DFLLVG I FPTTVHLQQHMRRHHSGTDAL 
YFTGLEGAFGTV^KYHPSRNNTIANFTFSAWVMPNANTNGFIIJUCDDGNGSIYYGVKIQTNESHVT 
LSLHYKTLGSNATYIAKTTVMKYLEESVWLHLLIILEDGIIEFYLDGNAMPRGIKSLKGEAITDGPG 
ILRIGAGINGNDRFTGLMQDVRSYERKLTLEEIYELHAMPAKSDLHPI SGYLEFRQGETNKSFI ISA 
R DDNDE EG EELF I LKLVS VYGG AR I SEENTTARLT I QKSDNANGLFGFTGAC I PEI AEEGSTI SCW 
ERTRGALDYVHVFYTISQIETDGINYLVDDFANASGTITFLPWQRSEVLNIYVLDDDIPELNEYFRV 
TLVSAIPGOGKLGSTPTSGASIDPEKETTDITIKASDHPYGLLQFSTGLPPQPK0AMTLPASSVPHI 
TVEEEDGE I RLLVIRAQGLLGRVT AEFRTVSLTAFS PEDYQNVAGTLEFQ PGER YKYI F IN I TONS I 
PEIiEKSFKVELLNLEGGVAELFRVDGSGSASLGVASQILVTIAASDHAHG\7FEFSPESLFVSGTEPE 
DGYSTVTl^IRHHGTLSPVTLHWNIDSDPDGDLAFTSGNITFEIGQTSANITVEILPDEDPELDKA 
FSVSVLSVSSGSLGAHINATLTVLASDDP YGIF IFSEKNRPVKVEEATQNI TLS I IRLKGLMGKVLV 
S YATLDDMEKPPYF P PNLARATQGRD YI P ASGFALFGANQS EAT I AI S ILDDDEPERSES VF IELLN 
STLVAKVQSRSIPNSPRLGPKVETIAQLIIIANDDAFGTLQLSAPIVRVAENHVGPIINVTRTGGAF 
ADVSVKFKAVPI TA I AGEDY S I ASSDWLLEGETSKAVPI YVIND I YPEL EES FL VQLMNETTGGAR 
LGALTEAVIIIEASDDPYGLFGFQITKLIVEEPEFNSVKVNLPIIRNSGTLGNVTVQWVATINGQLA 
TGDLRVVSGNVTF APGET IQTLLLEVL ADDVPE I EEVI QVQLTDASGGGT IGLDRI ANI 1 1 PANDDP 
YGTVAFAQMVYRVQEPLERS SC ANITVRRSGGHFGRLLLFYSTSDIDWALAMEEGQDLLS YYES PI 
QGVPDPLWRTWMNVSAVGEPLYTCATLCLKEQACS AFSFFSASEGPQCFWMTSWI S PAVNNSDFWTY 
RKNMTRVASLFSGQAVAG SDYEPVTRQWAIMQEGDEF ANLTVS I LPDDFPEMDES FL I SLLEVHLMN 
I SASLKNQPTIGQPNI STWIALNGDAFGVFVI YNI S PNTSEDGLFVEVQEQPQTLVELMI HRTGGS 
LGQVAVEWRWGGTATEGLDFIGAGEI LTFAEGETKKTVILTILDDSEPEDDES I IVSLVYTEGGSR 
I L P S SDT VRVNI LANDNVAG IVS FQTASR SVI GHEGE I L QFHVI RTF PGRGNVTVNWK 1 1 G QNLELN 
FANFSGQLFFPEGSLNTTLFVHLLDDNI PEEKEVYQV I LYDVRTQGVPPAG IALLDAQGYAAVLTVE 
ASDEPHGVLNFALSSRFVLLQEANITIQLFINREFGSLGAINVTYTTVPGMLSLKNQTVGI^ 
DFVPIIGFLILEEGETAAAINITILEDDVPELEEYFLVNLTYVGLTMAASTSFPPRLGMRGFLFVSF 
CSLQMK 
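
The ORF coordinates and protein lengths listed in Table 39A can be cross-checked against each other. The following is a minimal sketch, assuming that the "ORF Start" and "ORF Stop" positions are 1-based positions of the first base of the start codon and of the stop codon respectively; the function name `orf_codon_count` is hypothetical and used only for illustration.

```python
def orf_codon_count(start: int, stop: int) -> int:
    """Count the sense codons between a start codon and a stop codon.

    Assumption: `start` and `stop` are 1-based positions of the first base
    of the ATG and of the stop codon, as listed in the ORF rows of the table.
    """
    span = stop - start
    if span <= 0 or span % 3 != 0:
        raise ValueError("stop codon is not in frame with the start codon")
    return span // 3


# (ORF start, ORF stop, stated protein length) as listed in Table 39A
entries = {
    "NOV39c": (23, 4430, 1469),
    "NOV39d": (23, 8282, 2753),
}

for name, (start, stop, stated_aa) in entries.items():
    codons = orf_codon_count(start, stop)
    status = "consistent" if codons == stated_aa else "mismatch"
    print(f"{name}: {codons} codons vs {stated_aa} aa stated ({status})")
```

Under this assumption, both entries give a codon count equal to the stated number of amino acids.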



Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 39B.



Table 39B. Comparison of NOV39a against NOV39b through NOV39d.

Protein Sequence | NOV39a Residues; Match Residues | Identities; Similarities for the Matched Region
NOV39b | 1..2741; 1..2741 | 2684/2741 (97%); 2685/2741 (97%)
NOV39c | 1..1456; 1..1456 | 1442/1456 (99%); 1443/1456 (99%)
NOV39d | 1..2753; 1..2753 | 2700/2753 (98%); 2700/2753 (98%)
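
The identity figures in Table 39B are ratios of matching residues to aligned residues. As a hedged illustration only (the helper name `percent` is hypothetical, and the rounding convention used to produce the table is not stated in the text), the ratios can be recomputed as follows:

```python
def percent(matches: int, aligned: int) -> float:
    """Return matches/aligned expressed as a percentage."""
    if aligned <= 0:
        raise ValueError("aligned length must be positive")
    return 100.0 * matches / aligned


# Identity counts (matching residues, aligned residues) from Table 39B
identities = {
    "NOV39b": (2684, 2741),
    "NOV39c": (1442, 1456),
    "NOV39d": (2700, 2753),
}

for name, (matches, aligned) in identities.items():
    print(f"{name}: {matches}/{aligned} = {percent(matches, aligned):.2f}%")
```

For these three rows, the whole-number percentages in the table equal the computed values with the fractional part dropped.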



Further analysis of the NOV39a protein yielded the following properties shown in 
Table 39C.



Table 39C. Protein Sequence Properties NOV39a

PSort analysis: | 0.5050 probability located in cytoplasm; 0.3836 probability located in microbody (peroxisome); 0.1851 probability located in lysosome (lumen); 0.1000 probability located in mitochondrial matrix space
SignalP analysis: | No Known Signal Sequence Predicted



A search of the NOV39a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 39D.



Table 39D. Geneseq Results for NOV39a

Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV39a Residues; Match Residues | Identities; Similarities for the Matched Region | Expect Value
AAE10925 | Human monogenic audiogenic seizure-susceptible-1 (mass1) protein - Homo sapiens, 2777 aa. [WO200165927-A1, 13-SEP-2001] | 1..2753; 1..2777 | 2736/2778 (98%); 2739/2778 (98%) | 0.0
AAE10924 | Mouse monogenic audiogenic seizure-susceptible-1 (mass1) protein - Mus musculus, 2780 aa. [WO200165927-A1, 13-SEP-2001] | 1..2739; 1..2761 | 2295/2762 (83%); 2516/2762 (91%) | 0.0
AAE10949 | Mouse mass1 protein mutant (7009deltaG) - Mus musculus, 2071 aa. [WO200165927-A1, 13-SEP-2001] | 1..2049; 1..2071 | 1710/2072 (82%); 1878/2072 (90%) | 0.0
ABG61545 | Human transporter and ion channel, TRICH15, Incyte ID 7476089CD1 - Homo sapiens, 759 aa. [WO200240541-A2, 23-MAY-2002] | 1531..2288; 1..746 | 740/758 (97%); 740/758 (97%) | 0.0
ABB05663 | Human signal transduction protein clone amy2_10p7 - Homo sapiens, 1615 aa. [WO200198454-A2, 27-DEC-2001] | 2232..2741; 9..518 | 506/510 (99%); 507/510 (99%) | 0.0
In a BLAST search of public sequence databases, the NOV39a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 39E. 



Table 39E. Public BLASTP Results for NOV39a

Protein Accession Number | Protein/Organism/Length | NOV39a Residues; Match Residues | Identities; Similarities for the Matched Portion | Expect Value
Q8WXG9 | Very large G protein-coupled receptor 1b - Homo sapiens (Human), 6307 aa. | 1..2741; 180..2945 | 2721/2766 (98%); 2723/2766 (98%) | 0.0
Q91ZS2 | MASS1 - Mus musculus (Mouse), 2780 aa. | 1..2739; 1..2761 | 2293/2762 (83%); 2515/2762 (91%) | 0.0
Q8VHN7 | Very large G protein-coupled receptor 1 - Mus musculus (Mouse), 6298 aa. | 1..2741; 179..2941 | 2293/2764 (82%); 2514/2764 (89%) | 0.0
Q91ZS1 | MASS1.2 - Mus musculus (Mouse), 2238 aa. | 563..2739; 29..2219 | 1838/2192 (83%); 2004/2192 (90%) | 0.0
Q8TF58 | KIAA1943 protein - Homo sapiens (Human), 1054 aa (fragment). | 234..1273; 1..1050 | 1037/1050 (98%); 1037/1050 (98%) | 0.0



Example 40. 

The NOV40 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 40A. 



Table 40A. NOV40 Sequence Analysis 




SEQ ID NO: 167 | 2833 bp

NOV40a, CG151014-01 DNA Sequence


C AAAGATCC AGTTTGG AAATG AG AGAGGAC TAGC ATG AC AC ATTGGCTC C ACCATTGATATC TC CC A 


GAGGTACAGAAACAGGATTCATGAAGATGTTGACAAGACTGCAAGTTCTTACCTTAGCTTTGTTTTC 
AAAGGGATTTTTACTCTCTTTAGGGGACCATAACTTTCTAAGGAGAGAGATTAAAATAGAAGGTGAC 
CTTGTTTTAGGGGGCCTGTTTCCTATTAACGAAAAAGGCACTGGAACTGAAGAATGTGGGCGAATCA 




TTACTTGCTACCAGGAGTGAAGTTGGGTGTTCACATTTTGGATACATGTTCAAGGGATACCTATGCA 
TTGGAGCAATCACTGGAGTTTGTCAGGGCATCTTTGACAAAA 

CTGATGGATCCTATGCCATTCAAGAAAACATCCC^CTTCTCATTGCAGGGGTCATTGGTGGCTCTTA 
TAGCAGTGTTTCCATAC AGGTGGCAAACCTGCTGCGGCTC TTCCAGATCCC TC AGATC AGCTACGCA 
TCC^CCAGCGCCAAACTCAGTGATAAGTCGCGCTATGATTACTTTGCCAGGACCGTGCCCCCCGACT 
TCTACCAGGCCAAAGCCATGGCTGAGATCTTGCGCTTCTTCAACTGGACCTACGTGTCCACAGTAGC 
CTCCGAGGGTGATTACGGGGAGACAGGGATCGAGGCCTTCGAGCAGGAAGCCCGCCTGCGCAACATC 
TGCATCGCTACGGCGGAGAAGGTGGGCCGCTCCAACATCCGCAAGTCCTACGACAGCGTGATCCGAG 




CATTGCAGCCGCCAGCCGCGCCAATGCCTCCTTCACCTGGGTGGCCAGCGACGGCTGGGGCGCGCAG 
GAGAGCATCATCAAGGGCAGCGAGCATGTGGCCTACGGCGCCATCACCCTGGAGCTGGCCTCCCAGC 
CTGTCCGCCAGTTCGACCGCTACTTCCAGAGCCTCA^

CCGGGACTTCTGGGAGCAAAAGTTTCAGTGCAGCC TCC AGAAC AAACGCAACCACAGGCGCGTC TGC 
GACAAGCACCTGGCCATCGACAGCAGCAACTACGAGCAAGAGTCCAAGATCATGTTTGTGGTGAACG 
CGGTGTATGCCATGGCCCACGCTTTGCACAAAATGCAGCGCACCCTCTGTCCCAACACTACCAAGCT 
TTGTGATGCTATGAAGATCCTGGATGGGAAGAAGTTGTACAAGGATTACTTGCTGAAAATCAACTTC 
ACGGCTCCATTCAACCCAAATAAAGATGCAGATAGCATAGTCAAGTTTGACACTTTTGGAGATGGAA 
TGGGGCGATACAACGTGTTCAATTTCCAAAATGTAGGTGGAAAGTATTCCTACTTGAAAGTTGGTCA 
CTGGGCAGAAACCTTATCGCTAGATGTCAACTCTATCCACTGGTCCCGGAACTCAGTCCCCACTTCC 
CAGTGCAGCGACCCCTGTGCCCCCAATGAAATGAAGAATATGCAACCAGGGGATGTCTGCTGCTGGA 
TTTGCATCCCCTGTGAACCCTACGAATACCTGGCTGATGAGTTTACCTGTATGGATTGTGGGTCTGG 
ACAGTGGCCCACTGCAGACCTAACTGGATGCTATGACCTTCCTGAGGACTACATCAGGTGGGAAGAC 
GCCTGGGCCATTGGCCCAGTCACCATTGCCTGTCTGGGTTTTATGTGTACATGCATGGTTGTAACTG 
TTTTTATCAAGCACAACAACACACCCTTGGTCAAAGCATCGGGCCGAGAACTCTGCTACATCTTATT 
GTTTGGGGTTGGCCTGTC ATAC TGCATGAC ATTCTTCTTC ATTGCCAAG CC ATCACC AGTCATCTGT 
GCATTGCGCCGACTCGGGCTGGGGAGTTCCTTCGCTATCTGTTACTCAGCCCTGCTGACCAAGACAA 
ACTGC ATTGCC CGCATC TTCG ATGGGGTC AAGAATGGCGCTCAGAGGCC AAAATTCATC AGCCCC AG 
TTCTCAGGTTTTCATCTGCCTGGGTCTGATCCTG^TGCAAATTGTGATGGTGTCTGTGTGGCTCATC 
CTGGAGGCCCCAGGCACCAGGAGGTATACCCTTACAGAGAAGCGGGAAACAGTCATCCTAAAATGCA 
ATGTCAAAGATTCCAGCATGTTGATCTCTCTTACCTACGATGTGATCCTGGTGATCTTATGCACTGT 
GTACGC CTTC AAAACGCGGAAGTGCCC AGAAAATTTCAACGAAGC TAAGTTCATAGGTTTTACCATG 
TACACCACGTGCATCATCTGGTTGGCCTTCCTCCCTATATTTTATGTGACATCAAGTGACTACAGAC 
CTCTGCAAGCACGTATGTGTCAACGGTGTGCAATGGGCGGGAAGTCCTCGACTCCACCACCTCATCT 
CTGTGATTGTGAAT TGC AGTTC AGTTC TTGTGTTTTT AGAC TGTTAGAC AAAAGTGCTC ACGTGC AG 
nTPrA^AATRTnnAAArAGAGr-AAAAGAACAACCCTAGTACCTTTTTTTAG AAACAGTACGATAAAT 
TATTTTTGAGGACTGTATATAG TGATGTGCTAGAACTTTCTAGGCTGAGTCTAGTGCCCCTATTATT 
AACAATTCCCCCAGAACATGGAAATAACCATTGTTTACAGAGCTGAGCATTGGTGACAGGGTCTGAC 
ATGGTCAGTCTACTTCAAG 

ORF Start: ATG at 88 | ORF Stop: TAG at 2662





SEQ ID NO: 168 | 858 aa | MW at 96975.6kD

NOV40a, CG151014-01 Protein Sequence


MKMLTRLQVLTLAIiFSKGFLLSLGDHNFLRREIKIEGDLVLGGLFPINEKGTGTEECGRINEDRGIQ 
RLEAMLFAIDEINKDDYLLPGVKLGVHILOTCSR^^ 

QENI PLLI AGVIGGS YS SVS IQVANLLRLFQ I PQI SYASTS AKLSDKSRYDYFARTVPPDFYQAKAM 
AEILRFFNWTYVSTVASEGDYGETGIEAFEQEARLRNICIATAEKVGRSNIRKSYDSVIRELLQKPN 
ARVWLFMRSDDSRELI AAASRANASFTWVASDGWGAQES I IKGSEHVAYGAITLELASQFVRQFDR 
YFQ SLNP YNNHRNPWFRDFWEQKFQCSLQNKRNHRRVCDKHI^ 

ALHKMQRTLC PNTTKLCDAMKI LDGKKLYKDYLLKTNFTAPFNPNKDADS IVKFDTFGDGMGRYNVF 
NFQNVGGKYSYLKVGHWAETLSLDVNS IHWSRNSVPTSQCSDPCAPNEMKNMQPGDVCCWIC IPCEP 
YEYLADEFTCMDCGSGQWPTADLTGCYDLPEDYIRW^ 

TPLVKASGRELCYILLFGVGLSYCMTFFFIAKPSPVICALRRI^IXSS SFAICYSAIJjTKTNC IARIF 
DGVKNGAQRPKF I S PSSQWICLGLILVQIVMVSVV^ILEAPGTRRYTLTEKRETVILKCNVKDSSM 
LISLTYDVILVILCTVYAFKTRKCPENFNEAKFIGFTMYTTCIIWIJ^LPIFYVTSSDYRPLQARMC 
QRCAMGGKSSTPPPHLCDCELQFSSCVFRLLDKSAHVQLQNMETEQKNNPSTFF 






SEQ ID NO: 169 | 1758 bp

NOV40b, CG151014-02 DNA Sequence


CAAAGATCCAGTTTG<3AAATGAGAGAGGACTAGCATGACACATTGGCTCCACCATTGATATCTCCCA 


GAGGTACAGAAACAGGATTCATGAAGATGTTGACAAGACTGCAAGTTCTTACCTTAGCTTTGTTTTC 
AAAGGGATTTTTACTCTCTTTAGGGGACCATAACTTTCTAAGGAGAGAGATTAAAATAGAAGGTGAC 
CTTGTTTTAGGGGGCCTGTTTCCTATTAACGAAAAAGGCACTGGAACTGAAGAATGTGGGCGAATCA 
ATGAAGACCGAGGGATTCAACGCCTGGAAGCCATGTTGTTTGCTATTGATGAAATCAACAAAGATGA 
TTAC TTG CTACC AGG AGTG AAGTTGGGTGTTCAC ATT TTGG ATACATGTTCAAGGGATACCTATGCA 
TTGGAG^AATCACTGGAGTTTGTCAGGGCATCTTTGACAAAAGTGGATGAAGCTGAGTATATGTGTC 
CTGATGGATCCTATGCCATTCAAGAAAACATCCCACTTCTCATTGCAGGGGTCATTGGTGGCTCTTA 
TAGCAGTGTTTCCATACAGGTGGCAAACCTGCTGCGGCTCTTCCAGATCCCTCAGATCAGCTACGCA 
TCCACCAGCGCCAAACTCAGTGATAAGTCGCGCTATGATTACTTTGCCAGGACCGTGCCCCCCGACT 
TCTACCAGGCCAAAGCCATGGCTGAGATCTTGCGCTTCTTCAACTGGACCTACGTGTCCACAGTAGC 
CTCCGAGGGTGATTACGGGGAGACAGGGATCGAGGCCTTCGAGCAGGAAGCCCGCCTGCGCAACATC 
TGCATCGCTACGGCGGAGAAGGTGGGCCGCTCCAACATCCGCAAGTCCTACGACAGCGTGATCCGAG 
AACTGTTGCAGAAGCCCAACGCGCGCGTCGTGGTCCTCTTCATGCGCAGCGACGACTCGCGGGAGCT 
CATTGCAGCCGCCAGCCGCGCCAATGCCTCCTTCACCTGGGTGGCCAGCGACGGCTGGGGCGCGCAG 
GAGAGC ATC ATCAAGGGCAGCGAGCATGTGGCCTACGGCGC C ATCACC CTGGAGCTGGCCTCCCAGC 
r , TY5T , rPGr , CAGTTCGACCGCTACTTCCAGAGCCTC.?i5fc^

CCGGGACTTCTGGGAGCAAAAGTTTCAGTGCAGCCTCCAGAACAAACGCAACCACAGGCGCGTCTGC 
GACAAGCACCTGGCCATCGACAGCAGCAACTACGAGCAAGAGTCCAAGATCATGTTTGTGGTGAACG 
CGGTGTATGCC ATGGCCC ACGC TTTGC ACAAAATGCAGCGCACCCTC TGTCCCAAC ACTACC AAGCT 
TTGTGATGCTATGAAGATCCTGGATGGGAAGAAGTTGTACAAGGATTACTTGCTGAAAATCAACTTC 
ACGGGTGCAGACGACAACCATGTGCATCTCCGTCAGCCTGAGTGGCTTTGTGGTCTTGGGCTGTTTG 
TTTGCACCCAAGGTTCACATCATCCTGTTTCAACCCCAGAAGAATGTTGTCACACACAGACTGCACC 
TCAACAGGTTCAGTGTCAGTGGAACTGGGACCACATACTCTCAGTCCTCTGCAAGCACGTATGTGCC 
AACGGTGTGCAATGGGCGGG AAGTCCTCGACTC CACC ACCTCATCTC TGTG ATTGTG AATTGC AGTT 
C AGTTCTTGTGTTT'PTAGAC TGTTAGACAAAAGTGCTC ACGTGC AGCTCC AG AAT ATGG AAAC AG AG 


CAAAAGAACAACCCTA 




ORF Start: ATG at 88 | ORF Stop: TAG at 1699





SEQ ID NO: 170 |537 aa |MW at 60801.8kD 


NOV40b, 
CG151014-02 
Protein Sequence 


mkmltrlqvltlalfskgfllslgdhnflrreikiegd^^ 
rleamlfaideinkddyllpg\ftcix3vhildt^ 

qeniplliagviggsyssvsiqvanllrlfqipqisyastsaklsdksrydyfartvppdfyqakam 
aeilrffnwtwstvasegdygetgieafeqearlrniciataekvgrsnirksydsvirellqkpn 
arvvvlfmrsddsreli aaasranasftb^atasdgwgaqes 1 1 kg s ehvaygai tlelasqpvrqfdr 
yfqslnpynnhrnpwrdfweqkfqcslqnkrnh^ 
alhkmqrtlcpottklcdamkildgkklykdyllkinftc 

hpvstpeecchtqtapqqvqcqwi^hilsvix:khvcangvqwagsprlhhlisviwcssvlvfld 
c 





SEQ ID NO: 171 | 1758 bp

NOV40c, CG151014-03 DNA Sequence


ccttgatccagtttggaaatgagagaggactagcatgacacattggctccaccattgatatctccca 


gaggtacagaaa(^ggattcatgaagatgttgacaagactgcaagttcttaccttagctttgttttc 
aaagggatttttactctctttaggggaccataactttctaaggagagagattaaaatagaaggtgac 
cttgttttagggggcctgtttcctattaacgaaaaaggcactggaactgaagaatgtgggcgaatca 
atgaagaccgagggattcaacgcctggaagccatgttgtttgctattgatgaaatcaacaaagatga 
ttacttgctaccaggagtgaagttgggtgttcacattttggatacatgttcaagggatacctatgca 
ttggagcaaox:actggagtttgtcagggcatctttgacaaaagtggatgaagctgagtatatgtgtc 
ctgatggatcctatgccattcaagaaaacatcccacttctcattgcaggggtcattggtggctctta 
tagcagtgtttccatacaggtggcaaacctgctgcggctcttccagatccctcagatcagctacgca 
tccaccagcgccaaactcagtgataagtcgcgctatgattactttgccaggaccgtgccccccgact 
tctaccaggccaaagccatggctgagatcttgcgcttcttcaactggacctacgtgtccacagtagc 
ctccgagggtgattacggggagacagggatcgaggccttcgagcaggaagcccgcctgcgcaacatc 
tgcatcgctacggcggagaaggtgggccgctccaacatccgcaagtcctacgacagcgtgatccgag 

AACTGTTGCAGAAGCCCAACGCGCGCGTCGTGGTCCTCTTCATGCGCAGCGACGACTCGCGGGAGCT 
CATTGCAGCCGCCAGCCGCGCCAATGCCTCCTTCACCTGGGTGGCCAGCGACGGCTGGGGCGCGCAG 
GAGAGCATCATCAAGGGCAGCGAGCATGTGGCCTACGGCGCCATCACCCTGGAGCTGGCCTCCCAGC 
CTGTCCGCCAGTTCGACCGCTACTTCCAGAGCCTCAACCCCTACAACAACCACCGCAACCCCTGGTT 
CCGGGACTTCTGGGAGCAAAAGTTTCAGTGCAGCCTCCAGAACAAACGCAACCACAGGCGCGTCTGC 
GACAAGCACCTGGCCATCGACAGCAGCAACTACGAGCAAGAGTCCAAGATCATGTTTGTGGTGAACG 
CGGTGTATGCCATGGCCCACGCTTTGCACAAAATGCAGCGCACCCTCTGTCCCAACACTACCAAGCT 
TTGTGATGCTATGAAGATCCTGGATGGGAAGAAGTTGTACAAGGATTACTTGCTGAAAATCAACTTC 
ACGGGTGCAGACGAC AAC CATGTGCATCTCCGTCAGCC TG AGTGGCTTTGTGGTCTTGGGCTGTTTG 
TTTGCACCCAAGGTTCACATCATCCTGTTTCAACCCCAGAAGAATGTTGTCACACACAGACTGCACC 
TCAACAGGTTCAGTGTCAGTGGAACTGGGACCACATACTCTCAGTCCTCTGCAAGCACGTATGTGCC 
AACGGTGTGCAATGGGCGGGAAGTCCTCGACTCCACCACCTCATCTCTGTGATTGTGAATTGCAGTT 
C ART TCTTGTGTTTTTAGACTGTTAGACAAAAGTGCTC ACGTGC AGC TCC AGAATATGGAAAC AG AG 


CAAAAGAACAACCCTA 




ORF Start: ATG at 88 | ORF Stop: TAG at 1699
SEQ ID NO: 172 | 537 aa | MW at 60801.8kD

NOV40c, CG151014-03 Protein Sequence


MKMLTRLQVLTLALFSKGPLLSLGDHNFLRREIKIEGDLVLGGLFPINEKGTGTEECGRINEDRGIQ 
RLEAMLFAIDEINKDDYLLPGVKLGVHILDTCSRIOT 

QENIPLLIAGVIGGSYSSVS IQVANLLRLFQI PQI SYASTSAKLSDKSRYDYFARTVPPDFYQAKAM 
AEILRFFNWTYVSTVASEGDYGETGI EAFEQEARLRNIC I ATAEKVGRSNIRKSYDSVI RELLQKPN 
ARVWLFMRSDDSREL I AAASRANASFTWVAS DGWGAQES I IKGSEHVAYG AI TLEL AS QP VRQFDR 
YFQSLNPYNNHRNPWFRDFWEQKFQCSLQNKRNHRRV^ 

ALHKMQRTLC PNTTKLCDAMK I LDGKKL YKDYLLKIOTTGADDNHVHLRQPEWLCGLGLFVCTQGSH 
HPVSTPEECCHTQTA^QQVQCQWNVTOHILSVLCKHVCANGVQWAGSPRLHHLI SVIVNC SSVLVFLD 
C 
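
Table 40A pairs each DNA sequence with the polypeptide it encodes. The relationship between the two can be sketched with a small translation routine, assuming the standard genetic code; the names `CODON_TABLE` and `translate` are hypothetical and not taken from the source. The input fragment is the first 45 bases of the NOV40a ORF, beginning at the ATG at position 88, as read from the table.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 0, stopping at the first stop codon.

    Codons containing characters other than T, C, A, G are reported as 'X'.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 45 bases of the NOV40a ORF (starting at the ATG at position 88)
orf_fragment = "ATGAAGATGTTGACAAGACTGCAAGTTCTTACCTTAGCTTTGTTT"
print(translate(orf_fragment))
```

The translated fragment corresponds to the first 15 residues of the NOV40 protein sequences listed in the table.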




Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 40B. 



Table 40B. Comparison of NOV40a against NOV40b and NOV40c.

Protein Sequence | NOV40a Residues; Match Residues | Identities; Similarities for the Matched Region
NOV40b | 1..441; 1..441 | 409/441 (92%); 409/441 (92%)
NOV40c | 1..441; 1..441 | 409/441 (92%); 409/441 (92%)



Further analysis of the NOV40a protein yielded the following properties shown in 
Table 40C.



Table 40C. Protein Sequence Properties NOV40a

PSort analysis: | 0.6400 probability located in plasma membrane; 0.4600 probability located in Golgi body; 0.3700 probability located in endoplasmic reticulum (membrane); 0.1000 probability located in endoplasmic reticulum (lumen)
SignalP analysis: | Cleavage site between residues 25 and 26
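
The SignalP result above places the predicted cleavage site between residues 25 and 26. As a minimal sketch of what that implies for the sequence, the snippet below splits the N-terminal portion of NOV40a into the predicted signal peptide and the start of the mature chain. The helper `split_signal_peptide` is hypothetical, and the residues are given as best read from the NOV40 protein sequences in Table 40A.

```python
from typing import Tuple


def split_signal_peptide(protein: str, last_signal_residue: int) -> Tuple[str, str]:
    """Split a protein at a cleavage site.

    `last_signal_residue` is the 1-based index of the final residue of the
    signal peptide (25 for a cleavage site between residues 25 and 26).
    """
    if not 0 < last_signal_residue < len(protein):
        raise ValueError("cleavage site lies outside the sequence")
    return protein[:last_signal_residue], protein[last_signal_residue:]


# N-terminal residues of NOV40a, as read from Table 40A
n_terminus = "MKMLTRLQVLTLALFSKGFLLSLGDHNFLRREIK"
signal, mature_start = split_signal_peptide(n_terminus, 25)
print("signal peptide:", signal)
print("mature chain begins:", mature_start)
```

Under this reading, the mature polypeptide would begin at residue 26 (His).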



A search of the NOV40a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 40D.



Table 40D. Geneseq Results for NOV40a

Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV40a Residues; Match Residues | Identities; Similarities for the Matched Region | Expect Value
AAE15990 | Human glutamate receptor, metabotropic 3 (GRM3) protein - Homo sapiens, 877 aa. [WO200196350-A2, 20-DEC-2001] | 3..811; 1..809 | 797/809 (98%); 799/809 (98%) | 0.0
AAR82657 | Human mGluR3 - Homo sapiens, 877 aa. [WO9522609-A2, 24-AUG-1995] | 3..811; 1..809 | 797/809 (98%); 799/809 (98%) | 0.0
AAM23698 | Human EST encoded protein SEQ ID NO: 1223 - Homo sapiens, 857 aa. [WO200154477-A2, 02-AUG-2001] | 1..811; 1..811 | 796/811 (98%); 798/811 (98%) | 0.0
AAR64252 | Human mGluR3 - Homo sapiens, 879 aa. [WO9429449-A, 22-DEC-1994] | 1..811; 1..811 | 796/811 (98%); 799/811 (98%) | 0.0
AAO15105 | Human ph2SPMGluR3-CaR*AAA* Gqi5 fusion construct protein sequence - Chimeric - Homo sapiens, 1402 aa. [WO200229033-A2, 11-APR-2002] | 21..811; 17..807 | 777/791 (98%); 781/791 (98%) | 0.0



In a BLAST search of public sequence databases, the NOV40a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 40E. 



Table 40E. Public BLASTP Results for NOV40a

Protein Accession Number | Protein/Organism/Length | NOV40a Residues; Match Residues | Identities; Similarities for the Matched Portion | Expect Value
Q14832 | Metabotropic glutamate receptor 3 precursor (mGluR3) - Homo sapiens (Human), 877 aa. | 3..811; 1..809 | 797/809 (98%); 799/809 (98%) | 0.0
Q8TBH9 | Glutamate receptor, metabotropic 3 - Homo sapiens (Human), 877 aa. | 3..811; 1..809 | 797/809 (98%) | 
Q9QYS2 | Metabotropic glutamate receptor 3 protein - Mus musculus (Mouse), 879 aa. | 1..811; 1..811 | 773/811 (95%); 792/811 (97%) | 0.0
P31422 | Metabotropic glutamate receptor 3 precursor - Rattus norvegicus (Rat), 879 aa. | 1..811; 1..811 | 772/811 (95%); 790/811 (97%) | 0.0
JC7160 | metabotropic glutamate receptor subtype 3 precursor - mouse, 879 aa. | 1..811; 1..811 | 771/811 (95%); 790/811 (97%) | 0.0



PFam analysis predicts that the NOV40a protein contains the domains shown in 
Table 40F. 



Table 40F. Domain Analysis of NOV40a

Pfam Domain | NOV40a Match Region | Identities; Similarities for the Matched Region | Expect Value
ANF_receptor | 58..489 | 194/473 (41%); 399/473 (84%) | 3.2e-173
7tm_3 | 576..820 | 109/283 (39%); 217/283 (77%) | 3.1e-104



Example 41. 

The NOV41 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 41A. 

Table 41A. NOV41 Sequence Analysis 




SEQ ID NO: 173 | 880 bp

NOV41a, CG151297-01 DNA Sequence


GAATTCTGATGTGCTTCAGTGCACAGAACAGTAACAGATGAGCTGCTTTTGGGGAGAGCTTGAGTAC 


TCAGTCGGAGCATGATCATGGGGTCTAGTGCCACAGAGATTGAAGAATTGGAAAACACCACTTTTAA 
GTATC TT ACAGGAG AAC AG ACTGAAAAAATGTGGCAGC GC C TGAAAGGAATACTAAGATGCTTGGTG 
AAG C AGC TGGAAAG AGGTGATGTTAACGTCGTCGACTTAAAGAAGAATAT TGAAT ATGC GGCATCTG 
TGC TGG AAGC AGTTTATATC GATGAAACAAG AAGAC TTCTGG ATAC TG AAGATG AG CTC AGTGACAT 
TC AG AC TG AC TCAGTCCC ATC TG AAG TC C GG GAG TGGTTGG C TTC TACC TTTACACGGAAAATGGGG 
ATG ACAAAAAAG AAACC TGAGGAAAAACC AAAATTTCGGAG CATTGTGCATGC TGTTC AAGC TGGAA 
TTTTTGTGGAAAGAATGTACCGAAAAACATTTTCTCTTCTGACAGACTCAACAGAGAAAATTGTTAT 
TCCTCTTATAGAGGAAGCCTCAAAAGCCGAAACTTCTTCCTATGTGGCAAGCAGCTCAACCACCATT 
GTGGGGTTACACATTGCTGATGCACTAAGACGATCAAATACAAAAGGCTCCATGAGTGATGGGTCCT 
ATTCCCCAGACTACTCCCTTGCAGCAGTGGACCTGAAGAGTTTCAAGAACAACCTGGTGGACATCAT 
TCAGCAGAACAAAGAGAGGTGGAAAGAGTTAGCTGCACAAGAAGCAAGAACCAGTTCACAGAAGTGT 
GAGTTTATTCATCAGTAAACACCTTTAAGTAAAACCfrcto
AAGACTTGG 




ORF Start: ATG at 85 |ORF Stop: TAA at 820 





SEQ ID NO: 174 | 245 aa | MW at 27787.2kD

NOV41a, CG151297-01 Protein Sequence


MGSSATEIEELENTTFKYLTGEQTEKMWQRLKGIIjRCLVKQLERGDVNVVDLKKNI 
IDETRRLLDTEDELSDIQTOSVPSEVRDWLASTFTR 

YRKTFSLLTDSTEKIVIPLIEEASKAETSSYVASSSTTIVGIiHIADALRRSNTKGSMSDGSYSPDYS 
L AAVDLKS FKNNLVDI IQQNKERWKELAAQEARTSSQKCEFIHQ 





SEQ ID NO: 175 | 1817 bp

NOV41b, CG151297-02 DNA Sequence


TCAGTGCACAGAACAGTAACAGATGAGCTGCTTTTGGGGAGAGCTTGAGTACTCAGTCGGTCAGTAG 


TACAGTAGCAGGCTCACATGTACGGATTGTTCTTGTGAGGAGCATCATCATGGGGTCTAGTGCCACA 


GAGATTGAAGAATTGGAAAACACCACTTTTAAGTATCTTACAGGAGAACAGACTGAAAAAATGTGGC 
AGCGCC TG AAAGG AAT AC T AAGATGCTTGG TG AAGC AG C TGG AAAG AGGTG ATG TT AACGTCGTC G A 
CTTAAAGAAGAATATTGAATATGCGGCATCTGTGCTGGAAGCAGTTTATATCGATGAAACAAGAAGA 
CTTCTGGATACTGAAGATGAGCTCAGTGACATTCAGACTGACTCAGTCCCATCTGAAGTCCGGGACT 
GGTTGGCTTCTACCTTTACACGGAAAATGGGGATGACAAAAAAGAAACCTGAGGAAAAACCAAAATT 
TCGGAGCATTGTGCATGCTGTTCAAGCTGGAATTTTTGTGGAAAGAATGTACCGAAAAACATATCAT 
ATGGTTGGTTTGGCATATCCAGCAGCTGTCATCGTAACATTAAAGGATGTTGATAAATGGTCTTTCG 
ATGTATTTGCCCTAAATGAAGCAAGTGGAGAGCATAGTCTGAAGTTTATGATTTATGAACTGTTTAC 

cagatatgatcttatcaaccgtttcaagattcctgttox:ttgcctaatcacctttgcagaagcttta 
gaagttggttacggcaagtacaaaaatccatatcacaatttgattcatgcagctgatgtcactcaaa 
ctgtgcattacataatgcttcatacaggtatcatgcactggctcactgaactggaaattttagcaat 
ggtc tttg c tgc tg cc attcatgattatg ag catac aggg ac aac aaac aac tt tc ac attcagaca 
aggtcagatgttgccattttgtataatgatcgctctgtccttgagaatcaccacgtgagtgcagctt 
atcgacttatgcaagaagaagaaatgaatatcttgataaatttatccaaagatgactggagggatct 
tcggaacctagtgattgaaatggttttatctacagacatgtcaggtcacttccagcaaattaaaaat 
ataagaaacagtttgcagcagcctgaagggattgacagagccaaaaccatgtccctgattctccacg 
cagcagacatcagccacccagccaaatcctggaagctgcattatcggtggaccatggccctaatgga 
ggagtttttcctgcagggagataaagaagctgaattagggcttccattttccccactttgtgatcgg 
aagtcaaccatggtgg^ccagtcacaaataggtttcatcgatttcatagtagagccaacattttctc 
ttctgacagactcaacagagaaaattgttattcctcttatagaggaagcctcaaaagccgaaacttc 
ttcctatgtggcaagcagctcaaccaccattgtggggttacacattgctgatgcactaagacgatca 
aatacaaaaggctccatgagtgatgggtcctattccccagactactcccttgcagcagtggacctga 
agagtttcaagaacaacctggtggacatcattcagcagaacaaagagaggtggaaagagttagttgc 
acaagaagcaagaaccagttcacagaagtgtgagtttattcatcagtaaacacctttaagtaaaacc 
tcgtgcatggtggcagctctaatttgaccaaaagacttggagattttgattatgcttgctggatatc 




tattctgt 




ORF Start: ATG at 117 |ORF Stop: TAA at 1722





SEQ ID NO: 176 |535 aa


MW at 61249.3 Da


NOV41b, 
CG151297-02 
Protein Sequence 


MGSSATEIEELENTTFKYLTGEQTEKMWQRLKGILRCLVKQLERGDVNVVDLKKNIEYAASVLEAVY 
IDETRRLLDTEDELSDIQTDSVPSEVRDWLASTFTRKMGMTKKKPEEKPKFRSIVHAV 

YRKTYHMVGLAYPAAV I vtlkdvdkws fuvf alneasgehslkfmi yelftrydlinrfki fvscl I 

TFAEALEVGYGKYKNP YHNLIHAAUVTQTVHYIMLHTG IMHWLTELE I LAMVFAAAIHDYEHTGTTN 
NFH IQTOSDVAI LYNDRSVLENHHVSAAYKLMQEEEMNIL INLSKDDWRDLRNI*VIEMVLSTDMSGH 
FQQIKNIRNSLQQPEGIDRAKTMSLILHAADISHPAKSWKLHYRWTMALMEEFFLQGDKEAELGLPF 
SPLCDRKSTMVAQSQIGFIDFIVEPTFSLLTDSTEKIVIPLIEEASKAETSSYVASSSTTIVGLHIA 
DALRRSNTKGSMSDGS YS PDYSLAAVDLKSFKNNLVD 1 1 QQNKERWKELVAQEARTSSQKCEFIHQ 
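
The ORF coordinates reported in Table 41A can be cross-checked against the stated protein lengths. The sketch below assumes that each reported position is the 1-based index of the first base of the start or stop codon, which the listing does not state explicitly; the helper name `orf_codon_count` is illustrative only.

```python
def orf_codon_count(start: int, stop: int) -> int:
    """Count coding codons between the first base of the start codon and
    the first base of the stop codon (both 1-based; an assumed convention)."""
    span = stop - start
    if span <= 0 or span % 3 != 0:
        raise ValueError("start and stop codons are not in the same reading frame")
    return span // 3


# (clone, ORF start, ORF stop, reported protein length) as listed in Table 41A
records = [
    ("NOV41a", 85, 820, 245),
    ("NOV41b", 117, 1722, 535),
]

for clone, start, stop, reported_aa in records:
    codons = orf_codon_count(start, stop)
    print(clone, codons, "matches" if codons == reported_aa else "differs")
```

Under that assumption, both NOV41 entries are self-consistent: 735 and 1605 coding bases correspond to 245 and 535 residues.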
Sequence comparison of the above protein sequences yields the following sequence
relationships shown in Table 41B.



Table 41B. Comparison of NOV41a against NOV41b. 


Protein Sequence 


NOV41a Residues/ 
Match Residues 


Identities/ 

Similarities for the Matched Region 


NOV41b 


1..159 
1..159 


141/159 (88%) 
148/159 (92%) 
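
The percentage in the Identities column can be recomputed from the match counts. The sketch below assumes integer truncation rather than rounding. That is an inference from the tabulated values (141/159 is about 88.7% but is listed as 88%), not a documented property of the comparison software, and `percent_truncated` is a hypothetical helper name.

```python
def percent_truncated(matches: int, aligned: int) -> int:
    """Integer percentage of matching positions, truncated toward zero."""
    if aligned <= 0:
        raise ValueError("aligned length must be positive")
    return (100 * matches) // aligned


# Identity counts taken from the comparison tables in this section.
print(percent_truncated(141, 159))  # NOV41a vs NOV41b -> 88
print(percent_truncated(67, 94))    # NOV42a vs NOV42b -> 71
```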






Further analysis of the NOV41a protein yielded the following properties shown in
Table 41C.

Table 41C. Protein Sequence Properties NOV41a


PSort analysis: 


0.8800 probability located in nucleus; 0.1000 probability located in
mitochondrial matrix space; 0.1000 probability located in lysosome (lumen); 
0.1000 probability located in plasma membrane 


SignalP analysis: 


No Known Signal Sequence Predicted 
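
A PSort line such as the one in Table 41C can be turned into ranked (location, score) pairs. The sketch below is a minimal parser that assumes the "probability located in" wording shown above; the names `parse_psort` and `best_location` are illustrative and not part of PSort itself.

```python
import re

# One entry looks like: "0.8800 probability located in nucleus"
PSORT_ENTRY = re.compile(r"(\d*\.\d+) probability located in ([^;]+)")


def parse_psort(text: str) -> list[tuple[str, float]]:
    """Return (location, score) pairs in the order they appear."""
    return [(loc.strip(), float(score)) for score, loc in PSORT_ENTRY.findall(text)]


def best_location(text: str) -> tuple[str, float]:
    """Return the highest-scoring predicted location."""
    return max(parse_psort(text), key=lambda entry: entry[1])


nov41a = (
    "0.8800 probability located in nucleus; "
    "0.1000 probability located in mitochondrial matrix space; "
    "0.1000 probability located in lysosome (lumen); "
    "0.1000 probability located in plasma membrane"
)
print(best_location(nov41a))
print(sum(score for _, score in parse_psort(nov41a)))
```

The four NOV41a scores add up to about 1.18, so they are better read as independent per-compartment scores than as a probability distribution.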



A search of the NOV41a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 41D.



Table 41D. Geneseq Results for NOV41a 


Geneseq 
Identifier


Protein/Organism/Length
[Patent #, Date]


NOV41a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


AAB85116 


Human 3', 5' cyclic
nucleotide phosphodiesterase
(HSPDE1A3A) - Homo
sapiens, 535 aa.
[EP1097707-A1,
09-MAY-2001]


1..159 
1..159 


141/159 (88%) 
148/159 (92%) 


5e-75 


AAB85105 


Human 3', 5' cyclic
nucleotide phosphodiesterase
(HSPDE1A3A) - Homo
sapiens, 535 aa.


1..159 
1..159 


141/159 (88%) 
148/159 (92%) 


5e-75 





IfcrlUy / /Uo-Al, 
09-MAY-2001] 




AAE07953 


Human phosphodiesterase 
(PDE) type 1 protein - Homo 
sapiens, 535 aa. 

TCP 100*77 10 A1 

09-MAY-2001]


1..159 
1..159 


141/159 (88%) 
148/159 (92%)


5e-75 


AAE07917 


Human phosphodiesterase 
(PDE) type 1 protein - Homo 
sapiens, 535 aa. 

rem nn*7Ti o ai 
[JbrlUy//lo-AJ, 

09-MAY-2001] 


1..159 
1..159 


141/159 (88%) 
148/159 (92%) 


5e-75 


AAY80988 


Human 61 kD CaM-PDE 
(clone pHcam61-6N-7), SEQ 
ID NO:49 - Homo sapiens, 
535 aa. [US6015677-A, 
18-JAN-2000] 


1..159 
1..159 


141/159 (88%) 
148/159 (92%) 


5e-75 
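
Expect values in these tables are written in BLAST shorthand, and some later tables drop the leading coefficient (for example `e-128`). A minimal sketch for converting them to floats follows; it assumes that a bare `e-NNN` means `1e-NNN`, and `parse_expect` is a hypothetical helper name.

```python
def parse_expect(text: str) -> float:
    """Convert a BLAST-style Expect value string to a float."""
    value = text.strip().lower()
    if value.startswith("e"):
        # Assumed convention: "e-128" is shorthand for "1e-128".
        value = "1" + value
    return float(value)


for raw in ["5e-75", "e-128", "0.0"]:
    print(raw, parse_expect(raw))
```

Smaller values indicate matches that are less likely to arise by chance; `0.0` means the value underflowed the precision that was reported.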


In a BLAST search of public sequence databases, the NOV41a protein was found to
have homology to the proteins shown in the BLASTP data in Table 41E. 


Table 41E. Public BLASTP Results for NOV41a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV41a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


AAH22480 


Hypothetical 62.3 kDa protein - 
Homo sapiens (Human), 545 
aa. 


1..159 
1..159 


141/159 (88%) 
148/159 (92%) 


1e-74


P54750 


Calcium/calmodulin-dependent 
3',5'-cyclic nucleotide
phosphodiesterase 1A (EC 
3.1.4.17) (Cam-PDE 1A) (61 
kDa Cam-PDE) (hCam-1) - 
Homo sapiens (Human), 534 
aa. 


2..159 
1..158 


140/158 (88%) 
147/158 (92%) 


6e-74 


Q9EPR9 


Phosphodiesterase 1 A - Rattus 
norvegicus (Rat), 542 aa. 


1..159
1..159 


134/159 (84%) 
144/159 (90%) 


6e-71 


Q61481 


Calcium/calmodulin-dependent
3',5'-cyclic nucleotide
phosphodiesterase 1A (EC
3.1.4.17) (Cam-PDE 1A) (61
kDa Cam-PDE) - Mus
musculus (Mouse), 565 aa.


1..159 
21..179 


133/159 (83%) 
143/159 (89%) 


3e-70 



A45334 




3',5'-cyclic-nucleotide
phosphodiesterase (EC 
3.1.4.17) 1A, 

calmodulin-dependent, 61K 
brain form - bovine, 530 aa. 



PFam analysis predicts that the NOV41a protein contains the domains shown in the 
Table 41F. 



Table 41F. Domain Analysis of NOV41a


Pfam Domain 


NOV41a Match Region 


Identities/ 
Similarities 
for the Matched 
Region 


Expect Value 


PDEase 


138..159 


9/49(18%) 
22/49 (45%) 


0.11 



Example 42. 

The NOV42 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 42A. 



Table 42A. NOV42 Sequence Analysis 




SEQ ID NO: 177 |512 bp


NOV42a, 
CG151822-01 
DNA Sequence 


CCATGGCGGGCTGCGCGGCGCGGGCTCCGCCGGGCTCTGAGGCGCGTCTCAGCCTC6CCACCTTCCT 
GCTGGGCGCCTCGGTGCTCGCGCTGCCGCTGCTCACGCGCGCCGGCCTGCAGGGCCGCACCGGGCTG 
GCGCTCTACGTGGCCGGGCTCAACGCGCTGCTGCTGCTGCTCTATCGGCCGCCTCGCTACCAGATAG 
CCATCCGAGCTTGTTTCCTGGGGTTTGTGTTCGGCTGCGGCACGCTGCTAAGTTTTAGCCAGTCTTC 
TTGGAGTCACTTTGGCTQl^ACTGAAGCAGATTACCTGGCTCAGTGTCACAGGGCTGCTGATGGTGG'P 
CTTCGGAGAATGTCTGAGGAAGGCGGCCATGTNTACAGCTGGCTCCAATTTCAACCA 


AATGAAAAATCAGATACACATACTCTGGTGACCAGTGGAGTGTACGCTTGGTTTCGGCATCCTTCTT 


ACGTCGGGTGGTTTTACTGGAGTATTGGAACTCAGGTGATGCT 




ORF Start: ATG at 3 |ORF Stop: TGA at 285





SEQ ID NO: 178 |94 aa |MW at 9871.5 Da


NOV42a, 
CG151822-01 
Protein Sequence 


MAGCAARAP PG SEARLSLATFLLGASVLAL PLLTRAGLQGRTGLALYVAGLNALLLLL YRPPRYO I A 
IRACFLGFVFGCGTLLSFSQSSWSHFG 



SEQ ID NO: 179 |3597 bp



NOV42b, 
CG151822-02
DNA Sequence 



GGCACGAGCGGCGCCGCCGCCCGCTAGTCCGCCGCCCGGCGCCA TGGCGGGCTGCGCGGCGCGGGCT 



CCGCCGGGCTCTGAGGCGCGTCTCAGCCTCGCCACCTTCCTGCTGGGCGCCTCGGTGCTCGCGCTGC 
CGCTGCTCACGCGCGCCGGCCTGCAGGGCCGCACCGGGCTGGCGCTCTACGTGGCCGGGCTCAACGC 
GCTGCTGCTGCTGCTCTATCGGCCGCCTCGCTACCAGATAGCCATCCGAGCTTGTTTCCTGGGGTTT 
GTGTTCGGCTGCGGCACGCTGCTAAGTTTTAGCCAGTCTTCTTGGAGTCACTTTGGCTGGTACATGT 
GCTCCCTGTCATTGTTCCACTATTCTGAATACTTGGTGACAGCAGTCAATAATCCCAAAAGTCTGTC 



TTC AC ACTTGAAAATATC TTTTGGCC AG AACTGAAGC AGATTACC TGGCTC AGTGTC ACAGGGC TGC 
TGATGGTGGTCTTCGGAGAATGTCTGAGGAAGGCGGCCATGTTTACAGCTGGCTCCAATTTCAACCA 
CGTGGTACAGAATGAAAAATCAGATACACATACTCTGGTGACCAGTGGAGTGTACGCTTGGTTTCGG 
CATCCTTCTTACGTCGGGTGGTTTTACTGGAGTATTGGAACTCAGGTGATGCTGTGTAACCCCATCT 
GCGGCGTC AGC TATGC CCTGAC AGTGTGGCG ATTC TTCCGCGATCGAAC AGAAGAAG AAGAAATCTC 
ACTAATTCACTTTTTTGGAGAGGAGTACCTGGAGTATA^GAAGAGGGTGCCCACGGGCCTGCCTTTC 
ATAAAGGGGGTCAAGGTGGACCTGTQ ACGGGCAGTGGCCCCGGTGACCTTGGGGCCTCCGACCCTGT 
GCAGCCTGGGACAAAACTGTTTCCGGTTGGCCGCTGCCACATGGATTTTCTTAATCGTTTTATGTCA 



TTAGTCACTCTTCTGGAATGTCACTCAAGACCAAGCGGTCAGAAGGCCTGAGOACCCAAGGCCCCAC 



TGGAGCAGTCTG TCCTTATGCCGAATCAAGGCGGAACATGGGTGAAAGACGAGTAAGGGGCAAATCA 



CAGCAATATTCCACAGCGCCCTCCAGAGTTACCTGGGGAGGACCGAGGCCACACGCCACTGCCCCCG 



AGGCC AGAGTGTAAGTAAAGGATAACC AGGAC TCGC TGGG AGAG ATGGAC TC TGTCC TC AGC AAC AC 



TCCACAGCAGAAAGGGGTAGCAGGTACCCCTTCTTATCAGCGGTAAAAATGCATTTACAACCTTTCA 



TTTAACCGAAAAACACAGACCGCTTTAACCTCTTTATTTCTGTCCCCCACTGCATGAACATCTATAC 



AATTTTAAAAATACTTCCTCATAGG ATGC T TTGGCC CTTCATCTATTTAATCATAGCTACATACCTA 



TTTTTTTATAAGTAGCAGTACACATTCAAAGGGGTATTCCTAGCTCAATGCTTGGTGTTCTAGTTCA 



ACTTTTATCCTGCAGCAAGTAAGCCTAGATAACTCTACACGATTTGGCTGAGTGGCTTTGTGTGACC 



GTGGCCCCAGGCCAAGGGGACCATGGCCCTGGCTGGCTTTCCCCCGGGGGTCTCAGCTCCTGTTGTC 



AGTGATAGGCG GCTCAAAGGAGCATCAGTTTCTTTTGATCCAAGAAGTGCTTACTGAATGCCTGCCC 



TGTGCGTGGCC TTAAAC ATTGAGAAGTGCTGC TC TCCGTTTATTTGGGATTTGATTCTCATTTTACC 



ATAGCTTATATTCTCAATTTCAATGCCAGTCTCAGAACTCTTGTTTTCTGTGTTCTGTTCTCAAAAT 



TACATTGTCCCTCATGTCATTTCAAACTGTTTTCCAAAGGGATTTGAGCATATACAACTACAAATCC 



AAGCAGATTGACTCTCAAAAATAATCTTAAATACTGCAAATAGTCCCAACTAAGATTCAGTCAGTAT 



GTTTGTTTTGCAAGTTTGGGAGAGTAAGTTGGCTTTGAGTCACACATCGAAGCTTTAAGAGGTGAGA 



CGCTGGCTTCATTCTGGACTAGACAGGAACTTGGCCTCAGCGTGAGATCCTGCCATGCAGTGTTGCG 



GTGGCACTGAAGAAGTGTGAATGTGAAGGCGGCGTCGGCGCGGGGCCAGAGCACCACTCTGCTGCCC 
CACCACGCGGCCTGTGAGGAGCCACTAAACCTTTCCGTGCCTAGACCTCCCCATCTGTGG7VATGGGG 



TCAATACCACCTACCTCACAGGGGTGTTGTGAGGACTGAGAAGAACAATGTCAAATGTTTTTAATAC 



tcagatgtgggagcgacatcaatgaaatctgtactgtatgaaagctacacaaaaatgggcagacatt 
tgg ttaattgtgcc agatacc taaaatgtatgttc ag aaaagc at tttatc aac tcagaaatatgac 
ttatttctagattx:atggcttaatgaattttttcattgttatatataccaaagaggcttacgggttc 



ATTGATTGGTTTGAAAACCAGACAGACGGCCGGGCACGCCTGTAATCCCAAAGTGCTGGGATTGCAG 
CGTGAGCCACCACGCCCAGCCAAGATGAACTCCTTAAGGACAGGATTTGGTj^AGTGATTGACTTCTT 
TTTAGTTCCATGATCTTGAGATTATTTTTAGCTTTATAAATTTAGCAGTGGCAGGGCCCGTGGAGAA 



TCAGGTTAATGAGGTAAAGGC TTTCTGGGTATTTGCTGC CAAGGCC ACATC ACC AATTTTCTCGATT 
TAAAAAACTGTCAAGAGATTTATTTTTCCATTGCAGGTTTTAAAGTGGAGATTCTGAAGTGGAAAAT 
AGGTACTGTCAGAACAAAGCTACCTGGAAACAGCATAGAGTGAAGCCTTTCGTGAGGGCTTGCAGGC 



CGCTGCTGAGTGGCAGTTTACAGAAGAGGTCGCGGGGTGAGCCTCTTAGCAGGACAGAAAACAAGGC 
AGCAGCGC ACCTGC C AC C CC TTC ACGAGCTGC TCC TTGAGC CTAAAAAGTAGGCTTTATTC ATCC C T 
TCTGTTC ATTTACC AAC C TGGGGGATTGATACGAC CGGGGAAAATGTTCCTAAACC AGGAAGC TGCG 



TTAGCCGATCAGGCTTTGTAAGATCTCGCCAACAGCTAGCTGCTTAGGAGTACCCCCACGATACGCA 
CAGCACACCACTGTCCCTTCACTGCACTTTCTTCCTGCCTTAGGTAGTTGGGCTTGCCCACCCTAGT 
TTGCTTTTGTAGTGGTTTGGCAAGGTTAGAAGGCCTCGGCCTCTCTGTCATGCTGGGAAGTGCCTAC 



TCTCTGGGCCACTGCTGCAGAGGCCGTGGCACTTGTCATGGGTTTGGAAGACCCAGCCATCTGCAGC 
AGAGGCAGCCTATCCCATTGCAAGGAGAGGAACTGAACGGAGTAATTATTCTACTCTTCTTTTTACA 
T AAATGTTTATTTAAATATTCT AAATTGGATTTTC ATTC AC AGATAC TGATTATTC TTTCC AGTTC T 



TAAATAAAAC TGC AC TTG ATT TC AC TC AAAAAAAAAAAAAAAAAAA 



ORF Start: ATG at 44 |ORF Stop: TGA at 896





SEQ ID NO: 180 |284 aa |MW at 31937.7 Da


NOV42b, 
CG151822-02 
Protein Sequence 


MAGCAARAPPGSEARLSLATFLLGASVLALPLLTJtAGLQGRTGLAL 

IRACFLGFWGCGTLLSFSQSSWSHFGWYMCSLSLFHYSEYLWAVNNPKSLSIjDSFLl^SLErrV 
AALSSWLEFTLENIFWFBLKQITWLSVTGLLMVVFGECLRKAAMFTAGSNFNHWQNEKSOT 
SGVYAWFRH P SYVGWF YWS I GTQVMLCNP ICG VS YALTVWRFFRDRTEEEE I SLIHFFGEEYLEYKK 
RVPTGLPF IKGVKVDL 
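
The relationship between each DNA sequence and its encoded polypeptide in Table 42A can be illustrated by translating from the reported ORF start. The sketch below assumes the standard genetic code and a 1-based ORF start position; `translate` and the short `fragment` (the first 50 bases of the NOV42a DNA listing) are illustrative, and unreadable codons are reported as `X` rather than guessed.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str, orf_start: int) -> str:
    """Translate from a 1-based start position until the first stop codon."""
    seq = "".join(dna.split()).upper()  # drop OCR spaces and line breaks
    protein = []
    for pos in range(orf_start - 1, len(seq) - 2, 3):
        residue = CODON_TABLE.get(seq[pos:pos + 3], "X")  # X marks unreadable codons
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First 50 bases of the NOV42a DNA sequence; the ORF start is reported at position 3.
fragment = "CCATGGCGGGCTGCGCGGCGCGGGCTCCGCCGGGCTCTGAGGCGCGTCTC"
print(translate(fragment, 3))  # MAGCAARAPPGSEARL
```

The translated fragment reproduces the first sixteen residues of the NOV42a protein sequence as listed.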



Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 42B. 
Table 42B. Comparison of NOV42a against NOV42b.


Protein Sequence 


NOV42a Residues/ 
Match Residues 


Identities/ 

Similarities for the Matched Region 


NOV42b 


1..94 
1..94 


67/94 (71%) 
67/94 (71%) 
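
The residue ranges in these comparison tables use a `start..end` notation with inclusive, 1-based coordinates. A small sketch for reading them is given below; `parse_range` and `span_length` are hypothetical helper names, and the inclusive convention is an assumption that is consistent with the 1..94 range and the 94-residue denominator shown above.

```python
def parse_range(text: str) -> tuple[int, int]:
    """Parse a 'start..end' residue range into integers."""
    start, end = text.strip().split("..")
    return int(start), int(end)


def span_length(text: str) -> int:
    """Number of residues covered by an inclusive 1-based range."""
    start, end = parse_range(text)
    return end - start + 1


print(span_length("1..94"))   # 94
print(span_length("12..94"))  # 83
```

When an alignment contains gaps, the denominator in the Identities column can exceed either span, because it counts alignment columns rather than residues of one sequence.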



Further analysis of the NOV42a protein yielded the following properties shown in 
Table 42C. 
Table 42C. Protein Sequence Properties NOV42a


PSort analysis: 


0.6000 probability located in plasma membrane; 0.4000 probability located in 
Golgi body; 0.3174 probability located in mitochondrial intermembrane space; 
0.3000 probability located in endoplasmic reticulum (membrane) 


SignalP analysis: 


Cleavage site between residues 37 and 38 
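
The SignalP result above implies a 37-residue signal peptide followed by the mature chain. The sketch below splits a sequence at such a site; `split_at_cleavage` is an illustrative name, and the sequence used is the first 50 residues of NOV42a transcribed from the listing with OCR spaces removed.

```python
def split_at_cleavage(protein: str, last_signal_residue: int) -> tuple[str, str]:
    """Split a protein after the given 1-based residue number."""
    seq = "".join(protein.split()).upper()
    if not 0 < last_signal_residue < len(seq):
        raise ValueError("cleavage site lies outside the sequence")
    return seq[:last_signal_residue], seq[last_signal_residue:]


# First 50 residues of NOV42a (transcribed from Table 42A).
nov42a_prefix = "MAGCAARAPPGSEARLSLATFLLGASVLALPLLTRAGLQGRTGLALYVAG"
signal, mature = split_at_cleavage(nov42a_prefix, 37)
print(len(signal), mature[:6])
```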



A search of the NOV42a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 42D.



Table 42D. Geneseq Results for NOV42a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV42a 
Residues/ 
Match 
Residues 


Identities/
Similarities for 
the Matched 
Region 


Expect 
Value 


AAY32299 


Farnesyl-directed cysteine 
carboxymethyltransferase 
STE14 - Homo sapiens, 284 
aa. [WO9955878-A1,
04-NOV-1999] 


1..94 
1..94 


94/94(100%) 
94/94(100%) 


2e-48 


AAW67730 


Human prenylcysteine
carboxyl methyltransferase -
Homo sapiens, 284 aa.


1..94 
1..94 


94/94(100%) 
94/94(100%) 


2e-48 
[WO9856924-A1,
17-DEC-1998]


AAB32052 


Human secreted protein
BLAST search protein SEQ
ID NO: 110 - Homo sapiens,
223 aa. [WO200058350-A1,
05-OCT-2000]


12..94 
1..83 


83/83(100%) 
83/83 (100%) 


3e-41 


AAB32051 



Human secreted protein 
BLAST search protein SEQ 
ID NO: 109 - Homo sapiens, 
223 aa. [WO200058350-A1, 
05-OCT-2000]


12..94 
1..83 


83/83 (100%) 
83/83 (100%) 


3e-41 


AAY32300 



Mouse farnesyl-directed
cysteine
carboxymethyltransferase -
Mus musculus, 153 aa.
[WO9955878-A1,
04-NOV-1999]


5..94 
4..93 


82/90(91%) 
83/90(92%) 


2e-40 



In a BLAST search of public sequence databases, the NOV42a protein was found to
have homology to the proteins shown in the BLASTP data in Table 42E. 
Table 42E. Public BLASTP Results for NOV42a


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV42a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


O60725


Protein-S isoprenylcysteine 
O-methyltransferase (EC 
2.1.1.100) (Isoprenylcysteine 
carboxylmethyltransferase) 
(Prenylcysteine carboxyl 
methyltransferase) (pcCMT) 
(Prenylated protein carboxyl 
methyltransferase) (PPMT) - 
Homo sapiens (Human), 284 
aa. 


1..94 
1..94 


94/94 (100%) 
94/94 (100%) 


5e-48






Q9EQK7 


Protein-S isoprenylcysteine 
O-methyltransferase (EC
2.1.1.100) (Isoprenylcysteine
carboxylmethyltransferase)
(Prenylcysteine carboxyl 
methyltransferase) (pcCMT) 
(Prenylated protein carboxyl 
methyltransferase) (PPMT) - 
Mus musculus (Mouse), 283 
aa. 



5..94 
4..93 


84/90(93%) 
85/90(94%) 



2e-41 


O12947


Protein-S isoprenylcysteine 
O-methyltransferase (EC 
2.1.1.100) (Isoprenylcysteine 
carboxylmethyltransferase) 
(Prenylcysteine carboxyl 
methyltransferase) (pcCMT) 
(Prenylated protein carboxyl 
methyltransferase) (PPMT) 
(Farnesyl cysteine carboxyl 
methyltransferase) (FCMT) - 
Xenopus laevis (African
clawed frog), 288 aa. 


13..94 
9..98 


49/90(54%) 
59/90(65%) 


2e-19


Q9WVM4 


Protein-S isoprenylcysteine 
O-methyltransferase (EC 
2.1.1.100) (Isoprenylcysteine 
carboxylmethyltransferase) 
(Prenylcysteine carboxyl 
methyltransferase) (pcCMT) 
(Prenylated protein carboxyl 
methyltransferase) (PPMT) 
(Farnesyl cysteine carboxyl
methyltransferase) (FCMT) -
Rattus norvegicus (Rat), 232
aa (fragment). 


53..94
1..42 


39/42 (92%) 
40/42(94%) 


8e-17 


Q9R1L8 


Farnesyl cysteine carboxyl 
methyltransferase - Rattus 
norvegicus (Rat), 33 aa 
(fragment). 


65..94 
1..30 


28/30(93%) 
29/30(96%) 


4e-10 



Example 43. 

The NOV43 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 43A.



Table 43A. NOV43 Sequence Analysis 




SEQ ID NO: 181 |2306 bp


NOV43a,
CG152256-01
DNA Sequence


GCCATGGCGTCCTGCGTGGGGAGCCGGACCCTAAGCAAGGATGATGTGAACTACAAAATGCATTTCC


C^ATGATCAACGAGCAGCAAGTGGAGGACATCACC^ 

CCTGCTCAGCTTCACCATCGTCAGCCTCATGTACTTCGCCTTTACCAGGGATGACTCTGTTCCAGAA 
GACAACATCTGGAGAGGCATCCTCTCTGTTATTTTCTTCTTTCTTATCATCAGTGTGTTAGCTTTCC 
CCAATGGTCCGTTCACTCGACCTCATCCAGCCTTATGGCGAATGGTTTTTGGACTCAGTGTGCTCTA 
CTTCCTGTTCCTGGTATTCCTACTCTTCCTGAATTTCGAGCAGGTTAAATCTCTAATGTATTGGCTA 
GATCCAAATCTTCGATACGCCACAAGGGAAGCAGATGTCATGGAGTATGCTGTGAACTGCCATGTGA 
TCACCTGGGAGAGGATTATCAGCCACTTTGATATTTTTGCATTTGGACATTTCTGGGGCTGGGCCAT 
G AAGGCC TTGC TG ATCCGTAGTTACGGTCTCTGCTGG AC AATCAGTATT ACCTGGGAGCTGAC TG AG 
CTCTTCTTCATGCATCTCCTCCCCAATTTTGCCGAGTGCTGGTGGGATCAAGTCATTCTGGACATCC 
TGTTGTGCAATGGL.GG 1 v^CATTTGGCTGGGC A 1vs(j 1 Clj 1 1 rvsL.1_.L7tj 1 111 iALjALjAl\jAvjbAL 1 1 A 
CCACTGGGL.AAGL. 1 1L,AALjLtAL.A1 ICAl A^L.ACCAULvj^oAAoA1CAAV3AvjAvjU1\»1 IX, ILjL-ALj 1 1\_ 
ACTCCTGCTAGQ-TGGAOt. 1 A JXjTTCGATGG 1 1 l\>AUi-LwAAAiti 1\» 111 1 L, ALtALtALt 1 AL»L. 1 LjLjALj 
TGTACCTTTTL.A1GA1L^TCTG^ 1L>AA1AL.L. 11L1 1L 1 1\jAALjL.A1 AIL. 1 1 lxyi; 
GTTCCAAGCCAGTCATL.L.A1 1AAG1 rGGGGXAGAAl !L,rL»l 1 1A1 1 1 WjL A 1 AW- 1 CtUiLA 
GTGAGACAGTACTACGCTTACCTCACCGALAL.AL^o i\?CAAVjCVjV-Vj 1 ALiLTAAL*AL.AArL7L, 1 tiuu 1L* 1 
TTGGGGCTTTCACCACTTTCCTCTGTCTGTACGGL.AAGA1 1 l\jL7l AlL3LAL7AAL.AL.IAlviL7luAL.L.Lj 
AGAAAAGACCTACTCGGAGTGTGAAGATGGCACCTACAGTCLAGAvaA 1 L.TUv*Tl3L7LJi rUAL-AvjLxAAA 
gggac aaaagg t t c tg aag ac ag c c cac c c aagcatgcagg c aac aac gaaagc c a t i c t tc cagg a 
GaAGGAATCGGCATTCCAAGTCAAAAGTCACCAATGGCGTTGGAAAGAA^ 

tcaaagatgttccagagtgcctagaactgagagggaaatggaactcatttggaactccccgtgaot 


ggtcgaggcgcac agggcaagcaggaagaggcgagggcal. t rcaCTtjtaCa i c a i j. a 1 1 ivfAvja 1 co* i aalt 


tcttgtttcccacagacctggccgcgtcaggcagatcatcgcctggggggcctttgccaacgtgggg 


TCTCTTCTAACTTCAGCACTTGACATGCGGTCACCGGTGGCAGCGCGGTGTGTTGAAGGGaAACGGT 




ttccttcagacgaggcattaaccccatggttaatggactggtcaccagtttttattttatttttatg 


aatctacctttccattgattgatttaagt^ 


tatttgtttttaagttaggatgctttttaacagcctttagaagccgctgctgaaattgatactgggg 


GAAGGGTTCCCCTTCCTTCTAGAGCAGaaAAGGGAGAGAAGTGTTCTATTCCTGTTTGGTAACCTCA 


GTCTCCTGTAAGACCTCCTACCACATGGCGAGTATACACCaATCAGGAGAGGGTAGCTGCCTGCATA 


GGAGCCTCGCTTCCGATTATTCCCTTCCCaATATTATTCATCCAGACTTAGCCACAGTGCACAAAAG 


CaAACCTGCTAGAGAGGCAGTGAACACCACAGCTTCTCCCCAGCTTGGTGCCTTTTACATCGGGTTT 


GTTCTCCTTCCATGGTGTGTTGCTGACATTGTCACTGAGTCCCATGTGAGGTGCTGGTGAGTATTAC 


ctttcatctgtgccatgctctagaaccttgaccttgatagttcaccacgtctgatggatccctgttt 


taaataaaaacgattcactttaaagcct 




ORF Start: ATG at 4 |ORF Stop: TGA at 1324






SEQ ID NO: 182 |440 aa |MW at 51772.5 Da


NOV43a, 
CG152256-01 
Protein Sequence 


l^SCVG SRTL SKDDVNYKMHFRMINEQQVEDIT I DFF YRPHTI TLLS FT I VSLMYFAFTRDDS VPX3D 
NIWRGILSVIFFFLI I SVLAFPNGPFTRPHPALWmOTGLSVLYFLFLVFLLFLNFEQVKSIJm^LD 
PNLRYATREADVME YAVNCHVI TWER 1 1 SHFDI F AFGHFWGWAMKALL IRS YGLCWTI S I TWELTEL 
FBMHLLPl^AECWWDQVILDILLCNGGGIWIXSlWVCRFLEi^T^ 

PASWTYVRWFDPKSSFQRVAGVYLFMIIWQLTELISTTFFLKHIFVFQASHPLSWGRILFIGGITAEW 
RQYYAYLTOTQ CKRVGTQCWVF^AFTTFLCL YGMIWYAEHYGHREKT YS EC EDGT YS PE I SWHHRKG 
TKG S EDS P PKHAGNNE SH S S RRRNRH S K S KVTNGVGKK 



Further analysis of the NOV43a protein yielded the following properties shown in
Table 43B.



Table 43B. Protein Sequence Properties NOV43a 


PSort analysis: 


0.6000 probability located in plasma membrane; 0.4000 probability located in 
Golgi body; 0.3000 probability located in endoplasmic reticulum (membrane); 
0.0300 probability located in mitochondrial inner membrane 


SignalP analysis: 


No Known Signal Sequence Predicted 



A search of the NOV43a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 43C.

Table 43C. Geneseq Results for NOV43a


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date]


NOV43a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


ABB89640 


Human polypeptide SEQ ID 
NO 2016 - Homo sapiens, 
473 aa. [WO200190304-A2,
29-NOV-2001] 


1..440
1..473 


440/473 (93%) 
440/473 (93%) 


0.0 


AAB58945 


Breast and ovarian cancer 
associated antigen protein 
sequence SEQ ID 653 - 
Homo sapiens, 516 aa. 
[WO200055173-A1, 
21-SEP-2000] 


1..440 
44..516 


439/473 (92%) 
439/473 (92%) 


0.0 


ABB71324 


Drosophila melanogaster 
polypeptide SEQ ID NO 
40764 - Drosophila
melanogaster, 498 aa. 
[WO200171042-A2, 
27-SEP-2001] 


3..359 
59..412 


206/357 (57%) 
276/357 (76%) 


e-133 


AAB73515 


Human transferase HTFS-22, 
SEQ ID NO:22 - Homo
sapiens, 487 aa. 
[WO200132888-A2, 
10-MAY-2001] 


22..361 
45..389 


128/351 (36%) 
185/351 (52%) 


2e-60 


AAM79907 


Human protein SEQ ID NO 
3553 - Homo sapiens, 529 aa. 
[WO200157190-A2, 
09-AUG-2001] 


22..361 
63..407 


128/351 (36%) 
185/351 (52%) 


2e-60 



In a BLAST search of public sequence databases, the NOV43a protein was found to
have homology to the proteins shown in the BLASTP data in Table 43D. 
Table 43D. Public BLASTP Results for NOV43a



Protein 

Accession 

Number 


Protein/Organism/Length 


NOV43a
Residues/
Match
Residues


Identities/
Similarities for
the Matched
Portion


Expect
Value


P48651 


Phosphatidylserine synthase I
(Serine-exchange enzyme I)
(EC 2.7.8.-) - Homo sapiens
(Human), 473 aa.


1..440
1..473


440/473 (93%)
440/473 (93%)


0.0


Q99LH2 


Similar to phosphatidylserine 
synthase 1 - Mus musculus 
(Mouse), 4 15 aa. 


1..440 
1..473 


428/473 (90%) 
437/473 (91%) 


0.0 


Q00576 


Phosphatidylserine synthase I 
(Serine-exchange enzyme I) 
(EC 2.7.8.-) - Cricetulus
longicaudatus (Long-tailed 
hamster) (Chinese hamster), 
471 aa. 


1..440 
1..471 


428/473 (90%) 
434/473 (91%) 


0.0 


O55024


Phosphatidylserine synthase-1 
- Mus musculus (Mouse), 473 
aa. 


1..440 
1..473 


421/473 (89%) 
432/473 (91%) 


0.0 


Q9BSY0 


Similar to phosphatidylserine 
synthase 1 - Homo sapiens 
(Human), 334 aa (fragment). 


145..440 
6..334 


292/329 (88%) 
293/329(88%) 


e-178 



PFam analysis predicts that the NOV43a protein contains the domains shown in the 
Table 43E. 




Table 43E. Domain Analysis of NOV43a 


Pfam Domain


NOV43a Match Region


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


COLFI 


119..137


10/19 (53%) 
14/19 (74%) 


0.12 


PSS 


96..370 


179/310(58%) 
267/310(86%) 


1.1e-206



Example 44. 

The NOV44 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 44A.

Table 44A. NOV44 Sequence Analysis




SEQ ID NO: 183 |1151 bp


NOV44a, 
CG171804-01 
DNA Sequence 


CNTGNATTTGGCCGGGGGGCCATGTAGCTCCGAGCGGCGGATCGCGAGCCTCCTGCGAACCCCAGCC 


top Apnpppp^TTAfiPATTPGfippfSf^Af^ATPprcrcpAfsw^fSA atpty^a Anf^pfircTn a a a a apoTTi 


CGTCCTGCCCTCGCCCGGCCTCTCCATTCGTCCCCCGGGTAGAGAGGGTCGGCTCGTGCTCATCATC 


C TGTGCTCCGTGGTCTTCTC TGCCGTC TAC ATCCTCCTGTGCTGCTGGGCCGGCCTGC C CC TCTGCC 


TGGCCACCTGCCTGGACCACCACTTCCCCACAGGCTCCAGGCCCACTGTGCCGGGACCCCTG^ 


CAGTGGATATAGCAGTGTGCCAGATGGGAAGCCGCTGGTCCGCGAGCCCTGCCGCAGCTGTGCCGTG 


GTGTCCAGCTCCGGCCAAATGCTGGGCTCAGGCCTGGGTGCTGAGATCGACAGTGCCGAGTGCGTGT 
TCCGCATGAACCAGGCGCCCACCGTGGGCTTTGAGGCGGATGTGGGCCAGCGCAGCACCCTGCGTGT 
CGTCTC ACAC AC AAGCGTG CCGCTGCTGCTGCGCAACTATTC AC AC TACTTCCAGAAGGCCCGAGAC 
ACGCTCTACATGGTGTGGGGCCAGGGCAGGCACATGGACCGGGTGCTCGGCGGCCGCACCTACCGCA 
CGCTGCTGCAGC TC ACCAGGATGTACCCCGGCCTGCAGGTGTACACC TTC ACGGAGCGC ATGATGGC 
CTACTGCGACCAGATCTTCCAGGACGAGACGGGCAAGAACCGGAGGCAGTCGGGCTCCTTCCTCAGC 
ACCGGCTGGTTCACCATGATCCTCGCGCTGGAGCTGTGTGAGGAGATCGTGGTCTATGGGATGGTCA 
GCGACAGCTACTGCAGGGAGAAGAGCCACCCCTCAGTGCCTTACCACTACTTTGAGAAGGGCCGGCT 
AGATGAGTGTCAGATGTACCTGGCACACGAGCAGGCGCCCCGAAGCGCCCACCGCTTCATCACTGAG 
AAGGCGGTCTTCTCCCGCTGGGCCAAGAAGAGGCCCATCGTGTTCGCCCATCCGTCCTGGAGGACTG 
AGTAGCTTCCGTCGTCCTGCCAGCCGCCATGCCGTTGCGAGGCCTCCGGGATGTCCCATCCCAAGCC 


ATCACACTCCAC 




ORF Start: ATG at 421 |ORF Stop: TAG at 1075








SEQ ID NO: 184 |218 aa 


MW at 25333.8 Da


NOV44a, 
CG171804-01 
Protein Sequence 


MLGSGLGAEIDSAECVFRMNQAPTVGFEADVGQRSTLRWSOT 

GQGRHMDRVLGGRT YRTLLQLTRMYPGLQVYTFTERMMAYCDQ I FQDETGKNRRQSGS FLS TGWFTM 
I L AL ELCEE I VVYGMVSDSYCREKSHPS VPYHYFEKGRLDECQMYLAHEQAPRSAHRF I TEKAVFSR 
WAKKRPIVFAHPSWRTE 
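
A molecular weight of the kind reported alongside each protein sequence can be estimated from its residue composition. The sketch below uses standard average residue masses plus one water molecule; the tool that produced the tabulated values is not identified in the listing, so small differences from those values should be expected, and `molecular_weight` is an illustrative name.

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is retained by the free termini


def molecular_weight(protein: str) -> float:
    """Average molecular mass in daltons; raises KeyError on unreadable residues."""
    seq = "".join(protein.split()).upper()  # drop OCR spaces and line breaks
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


# Final line of the NOV44a protein sequence as listed above.
print(round(molecular_weight("WAKKRPIVFAHPSWRTE"), 1))
```

Raising an error on unknown letters is deliberate: OCR-damaged residues in a listing should be resolved against the DNA sequence rather than silently skipped.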



Further analysis of the NOV44a protein yielded the following properties shown in
Table 44B.



Table 44B. Protein Sequence Properties NOV44a 


PSort analysis: 


0.6400 probability located in microbody (peroxisome); 0.3000 probability 
located in nucleus; 0.2068 probability located in lysosome (lumen); 0.1000 
probability located in mitochondrial matrix space


SignalP analysis: 


No Known Signal Sequence Predicted 



A search of the NOV44a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 44C.

Table 44C. Geneseq Results for NOV44a


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV44a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Region 


Expect 
Value 


AAB75350 


Human secreted protein #9 - 
Homo sapiens, 302 aa. 
[WO200100806-A2, 
04-JAN-2001] 


1..218 
85..302 


218/218 (100%) 
218/218 (100%) 


e-128 


AAB61614 


Human protein HP03380 - 
Homo sapiens, 302 aa. 
[WO200102563-A2, 
11-JAN-2001]


1..218 
85..302 


218/218 (100%) 
218/218 (100%) 


e-128


AAB25764


Human secreted protein SEQ 
ID #76 - Homo sapiens, 302 
aa. [WO200037491-A2, 
29-JUN-2000] 


1..218 
85..302 


218/218 (100%) 
218/218 (100%) 


e-128 


AAB28674 


Human
carbohydrate-modifying
enzyme Incyte ID No:
983984CD1 - Homo sapiens, 
302 aa. [WO200063351-A2, 
26-OCT-2000] 


1..218 
85..302 


218/218 (100%) 
218/218 (100%) 


e-128 


AAB24495 


Human secreted protein 
sequence encoded by gene 5 
SEQ ID NO: 120 - Homo
sapiens, 345 aa. 
[WO200035937-A1, 
22-JUN-2000] 


1..218 
128..345 


217/218 (99%) 
217/218 (99%) 


e-128 



In a BLAST search of public sequence databases, the NOV44a protein was found to
have homology to the proteins shown in the BLASTP data in Table 44D.



Table 44D. Public BLASTP Results for NOV44a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV44a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 
Q9H4F1


Alpha-N-acetyl-neuraminyl-2,3-beta-galactosyl-1,3-N-acetylgalactosaminide

(NeuAc-alpha-2,3-Gal-beta-1,3-GalNAc-alpha-2,6-sialyltransferase) (ST6GalNAc
IV) (Sialyltransferase 7D) - Homo sapiens
(Human), 302 aa.


1..218
85..302


218/218 (100%)
218/218 (100%)


e-128


Q9H4F1 


Alpha2,6-sialyltransferase - Homo sapiens
(Human), 302 aa.


1..218
85..302 


217/218 (99%) 
218/218 (99%) 


e-128 


Q9NWU6 


CDNA FLJ20593 fis, clone KAT08984 - 
Homo sapiens (Human), 302 aa. 


1..218 
85..302 


217/218 (99%) 
217/218 (99%) 


e-127 


Q9UKU1 


NeuAc-alpha-2,3-Gal-beta-1,3-GalNAc-alpha-2,6-sialyltransferase
alpha2,6-sialyltransferase - Homo sapiens 
(Human), 302 aa. 


1..218 
85..302 


216/218 (99%) 
216/218 (99%) 


e-127 


Q9R2B6 


Alpha-N-acetyl-neuraminyl-2,3-beta-galactosyl-1,3-N-acetylgalactosaminide
alpha-2,6-sialyltransferase (EC 2.4.99.7)
(NeuAc-alpha-2,3-Gal-beta-1,3-GalNAc-alpha-2,6-sialyltransferase) (ST6GalNAc
IV) (Sialyltransferase 7D) - Mus musculus
(Mouse), 360 aa.


1..218 
143..360 


202/218 (92%) 
207/218 (94%) 


e-118 



PFam analysis predicts that the NOV44a protein contains the domains shown in the 
Table 44E. 




Table 44E. Domain Analysis of NOV44a 


Pfam Domain 


NOV44a Match 
Region 


Identities/ 
Similarities 
for the Matched 
Region 


Expect Value 


Glyco_transf_29 


1..202 


65/324(20%) 
184/324 (57%) 


6e-43 



Example 45. 

The NOV45 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 45A.



Table 45A. NOV45 Sequence Analysis




SEQ ID NO: 185 |1475 bp


NOV45a,
CG171841-01
DNA Sequence


AGGACTCCAAGCGCCATGGCCGCTGCCGCCCGAGCCCGGGTCGCGTACTTGCTGAGOCAACTGCMC



GCGCAGCA TGGCTGTTTCAAATATTAGATATGGAGCAGCAGTTACAAAGGAAGTAGGAATGGCAGAC 
CTAAAAAACATGGGTGCTAAAAATGTGTGCTTGATGACAGACAAGAACCTCTCCAAGCTCCCTCCTG 
TGCAAGTAGCTATGGATTCCCTAGTGAAGAATGGCATCCCCTTTACGGTTTATGATAATGTGAGAGT 
GGAACCAACGGATAGCTTCATGGAAGCTATTGAGTTTGCCCAAAAGGGAGCTTTTGATGCCTATGTT 
GCTGTCGGTGGTGGCTCTACCATGGACACCTGTAAGGCTGCTAATCTGTATGCATCCAGCCCTCATT 

TCTGATTGCAGTGCCAACTACCTCAGGAACCGGGAGTGAAACTACTGGGGTTGCC7VTTTTTGACTAT 
GAACACTTGAAAGTAAAAATTGGCATCACTTCGAGAGCCATCAAACCCACACTGGGACTGATTGATC 
CTCTGCACACCCTCCACATGCCTGCCCGAGTGGTCGCCAACAGTGGCTTTGATGTGTTTAGCCATGC 
CCTGGAGTCATACACCACCCTGCCCTACCACCTGCGGAGCCCCTGCCCTTCAAATCCCATCACACGG 
CCTGCGTACCAGGGCAGCAACCCAATCAGTGACATTTGGGCTATCCACGCGCTGCGGATCGTGGCTA 
AGTATC TG AAGGCTGTC AGAAATCCCGATG ATCTTGAAGCAAGGTCTCATATGC AC TTGGC AAGTGC 
TTTTGCTGGCATCGGCTTTGGAAATGCTGGTGTTCATCTGCATGGAATGTCTTACCCAATTTCAGGT 
TTAGTGAAGATGTATAAAGCAAAGGATTACAATGTGGATCACCCACTGGTGCCCCATGGCCTTTCTG 
TGGTGCTCACGTCCCCAGCGGTGTTCACTTTCACGGCCCAGATGTTTCCAGAGCGACACCTGGAGAT 



ACGCTCCGGAAATTCTTATTCGATCTGGATGTTGATGATGGCCTAGCAGCTGTTGGTTACTCCAAAG 
CTGATATCCCCGCACTAGTGAAAGGAACGCTGCCCCAGGAAAGGGTCACCAAGCTTGCACCCTGTCC 
CC AGTC AG AAGAGG AT C TGG C TGC TC TGTTTGAAG CT TC AATG AAAC TGTATT AAT TGTC ATT T T AA 



CTGAAAGAATTACCGCTGGCCATTGTAGTGCTGAGAGCA AGAGCTGATCTAGCTAGGGCTTTGTCTT 



TTCATCTTTGCGCATAACTTACCTGTTACCAGTATAGGTGGGATATAC ATTTATCTTGCAGGAAATT 



ORF Start: ATG at 75 |ORF Stop: TAA at 1326





SEQ ID NO: 186 |417 aa |MW at 44871.2 Da


NOV45a, 
CG171841-01 
Protein Sequence 


MAVSN I R YG AAV TK EVGMADLKNMGAKNVC LMTDKNLSKL P PVQVAMDSLVKNG I PF TVYDNVRVEP 
TDSFMEAI EFAQKG AFDAYVAVGGG STMDTCKAANLYAS S PHSDFLDYVS AP IGKGK PVSVPLKPL I 
AVPTTSGTGSETTGVAIFDYEHLKVKIGITSRAIKPTljGLIDPLHTLHMPARVVANSGFDVFSHALE 
SYTTLPYHLRSPCPSNPITOPAYQGSNPISDIWAIHALRIVAKYLKAVRNPDDLEARSHMHLASAFA 
GIGFGNAGVHLHGMS YP I SGLVKMYKAKDYNVDHPLVPHGLSVVLTS PAVFTFTAQMFPERHLEMAE 
LLGADTRTAR I QDAGLVLADTLRKFLFDLDVDDGLAAVGYSKAD I PALVKGTLPQERVTKL APC PQS 
EEDLAALFEASMKLY 



Further analysis of the NOV45a protein yielded the following properties shown in 
Table 45B. 




Table 45B. Protein Sequence Properties NOV45a 


PSort analysis: 


0.4500 probability located in cytoplasm; 0.3188 probability located in
microbody (peroxisome); 0.2355 probability located in lysosome (lumen); 
0.1000 probability located in mitochondrial matrix space 


SignalP analysis: 


No Known Signal Sequence Predicted 



A search of the NOV45a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 45C.

Table 45C. Geneseq Results for NOV45a


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date]


NOV45a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


AAE21522 


Human dehydrogenase 
DHDR-6 protein - Homo 
sapiens, 467 aa. 
[WO200216562-A2, 
28-FEB-2002] 


1..417 
49..467 


413/420 (98%) 
414/420(98%) 


0.0 


AAd73ooo 


Human oxidoreductase 
protein ORP-19 - Homo 
sapiens, 467 aa. 
[WO200144448-A2, 
21-JUN-2001] 


1..417

49..467 


4197490 CQR%^ 
413/420 (98%) 


0.0


ABB59876 


Drosophila melanogaster 
polypeptide SEQ ID NO 
6420 - Drosophila 
melanogaster, 464 aa. 
[WO200171042-A2, 
27-SEP-2001] 


1 AM 

46..464 


Z.J'+fttZAj \\J\J fO) 

3T1IA20 (77%) 


p-14£ 

C'ltU 


ABG08093 


Novel human diagnostic 
protein #8084 - Homo 
sapiens, 268 aa. 
[WO200175067-A2, 
11-OCT-2001]


62..322
1..268 


240/268 (89%) 
243/268 (90%) 


e-131 


AAB42855 


Human ORFX ORF2619 
polypeptide sequence SEQ 
ID NO:5238 - Homo sapiens, 
212 aa. [WO200058473-A2, 
05-OCT-2000] 


247..417 
41..212 


168/172 (97%) 
170/172(98%) 


7e-91 



In a BLAST search of public sequence databases, the NOV45a protein was found to
have homology to the proteins shown in the BLASTP data in Table 45D.



Table 45D. Public BLASTP Results for NOV45a

Protein Accession Number | Protein/Organism/Length | NOV45a Residues/Match Residues | Identities/Similarities for the Matched Portion | Expect Value
CAD28993 | Sequence 4 from Patent WO0216562 - Homo sapiens (Human), 467 aa. | 1..417 / 49..467 | 413/420 (98%) / 414/420 (98%) | 0.0
Q96MF9 | CDNA FLJ32430 fis, clone SKMUS2001129, weakly similar to NAD-dependent methanol dehydrogenase (EC 1.1.1.244) - Homo sapiens (Human), 419 aa. | 1..417 / 1..419 | 412/420 (98%) / 413/420 (98%) | 0.0
Q8R0N6 | Hypothetical 45.0 kDa protein - Mus musculus (Mouse), 419 aa. | 1..417 / 1..419 | 372/420 (88%) / 394/420 (93%) | 0.0
Q9W265 | T3DH protein - Drosophila melanogaster (Fruit fly), 464 aa. | 1..417 / 46..464 | 254/420 (60%) / 327/420 (77%) | e-145
Q95S86 | GM05887p - Drosophila melanogaster (Fruit fly), 425 aa. | 1..417 / 7..425 | 254/420 (60%) / 327/420 (77%) | e-145
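
The percentages in Table 45D can be re-derived from the residue counts. The following is a minimal sketch, assuming each cell has the form `matches/aligned (pct%)` and that the printed percentage is the ratio truncated to a whole number, which is what the rows of this table suggest. The helper name `check_cell` is hypothetical and is not part of any search tool.

```python
import re

# Identity and similarity cells copied from Table 45D.
CELLS = [
    "413/420 (98%)", "414/420 (98%)",
    "412/420 (98%)", "413/420 (98%)",
    "372/420 (88%)", "394/420 (93%)",
    "254/420 (60%)", "327/420 (77%)",
]

CELL_RE = re.compile(r"^\s*(\d+)\s*/\s*(\d+)\s*\((\d+)%\)\s*$")

def check_cell(cell):
    """Return (matches, aligned, printed_pct, derived_pct) for one table cell."""
    m = CELL_RE.match(cell)
    if m is None:
        raise ValueError(f"unrecognised cell: {cell!r}")
    matches, aligned, printed = (int(g) for g in m.groups())
    # Assumption: the table truncates rather than rounds the percentage.
    derived = (100 * matches) // aligned
    return matches, aligned, printed, derived

for cell in CELLS:
    matches, aligned, printed, derived = check_cell(cell)
    status = "ok" if printed == derived else "differs"
    print(f"{cell:>15}  derived {derived}%  {status}")
```

A cell reported as "differs" would point either to a transcription error or to a different rounding convention, so the check is a consistency aid rather than a statement about how the search program computes its figures.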



PFam analysis predicts that the NOV45a protein contains the domains shown in the 
Table 45E. 



Table 45E. Domain Analysis of NOV45a

Pfam Domain | NOV45a Match Region | Identities/Similarities for the Matched Region | Expect Value
Fe-ADH | 4..205 | 68/216 (31%) / 143/216 (66%) | 5.6e-28
Fe-ADH | 228..288 | 30/68 (44%) / 51/68 (75%) | 2.5e-10



Example 46. 

The NOV46 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 46A. 
Table 46A. NOV46 Sequence Analysis




SEQ ID NO: 187 |1310 bp


NOV46a, 
CG173017-01 
DNA Sequence 


CTACTCTCAGCCAGGAATCATGTCTTGGGCCGCTCGCCCGCCCTTCCTCCCTCAGCGGCATGCCGCA 
GGGCAGTGTGGGCCGGTGGGGGTGCGAAAAGAAATGCATTGTGGGGTCGCGTCCCGGTGGCGGCGGC 
GACGGCCCTGGCTGGATCCCGCAGCGGCGGCGGCGGCGGCGGTGGCAGGCGGAGAACAACAAACCCC 
GGAGCCGGAGCCAGGGGAGGCTGGACGGGACGGGATGGGCGACAGCGGGCGGGGTGGCCCTGGGGCT 
GGCAAACGGCTATGTGCAATCTGCGGGGACAGAAGCTCAGGCAAACACTACGGGGTTTACAGCTGTG 
AGGGTTGCAAGGGCTTCTTCAAACGCACCATCCGCAAAGACCTTACATACTCTTGCCGGGACAACAA 
AGACTGCACAGTGGACAAGCGCCAGCGGAACCGCTGTCAGTACTGCCGCTATCAGAAGTGCCTGGCC 
ACTGGCATGAAGAGGGAGGCGGTACAGGAGGAGCGTCAGCGGGGAAAGGACAGGGATGGGGATGGGG 
AGGGGGCTGGGGGAGCCCCCGAGGAGATGCCTGTGGACAGGATCCTGGAGGCAGAGCTTGCTGTGGA 
ACAGAAGAGTGACCAGGGCGTTGAGGGTCCTGGGGGAACCGGGGGTAGCGGCAGCAGCCCAAATGAC 
CCTGTGACTAACATCTGTCAGGCAGCTGACAAACAGCTATTCACGCTTGTTGAGTGGGCGAAGAGGA 




CCTCATTGCCTCCTTTTCACACCGATCCATTGATGTTCGAGATGGCATCCTCCTTGCCACAGGTCTT 
CACGTGCACCGCAACTCAGCCCATTCAGCAGGAGTAGGAGCCATCTTTGATCGGGTGCTGACAGAGC 
TAGTGTCCAAAATGCGTGACATGAGGATGGACAAGACAGAGCTTGGCTGCCTGAGGGCAATCATTCT 
GTTTAATCCAGATGCCAA9GGCCTCTCCAACCCTAGTGAGGTGGAGGTCCTGCGGGAGAAAGTGTAT 
GCATCACTGGAGACCTACTGCAAACAGAAGTACCCTGAGCAGCAGGGACGGTTTGCCAAGCTGCTGC 
TACGTCTTCCTGCCCTCCGGTCCATTGGCCTTT^AGTGTCTAGAGCATCTGTTTTTCTTCAAGCTCAT 
TGGTGACACCCCCATCGACACCTTCCTCATGGAGATGCTTGAGGCTCCCCATCAACTGGCCTGAGCT 
C AG ACC C AGACGTGGTGCTTC TCC ACAC TGGAGGAGC 




ORF Start: ATG at 20 | ORF Stop: TGA at 1268








SEQ ID NO: 188 |416 aa |MW at 45778.7kD


NOV46a, 
CG173017-01 
Protein Sequence 


MSWAARPPFLPQRHAAGQCGPVGVRKEMHCGVASRWRRRRPWLDPAAAAAAAVAGGEQQTPEPEPGE 

AGRDGMGDSGRGGPGAGKRLCAICGDRSSGKHYGVYSCEGCKGFFKRTIRKDLTYSCRDNKDCTV^ 

RQRNRCQYCRYQKCLATGMKREAVQEERQRGKDRDGDGEGAGGAPEEMPVDRILEAELAVEQKSDQG 

VEGPGGT^SGSSPNDPVTNICQAADKQLFTLVEWAKRIPHFSSLPLDDQVILLRAGWNELLIASFS 

HRS IDVRI^ILLATGLHVHRNSAHSAGVGAIFDRVLTELVSKMRDMRMDKTELGCLRAI ILFNPDAK 

GLSNPSEVEVLREKVYASLETYCKQKYPEQO/SRFAKLLLRLPALRSIGLKCLEHLFFFKLIGiyrPID 

TFLMEMLEAPHOLA 
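
The ORF coordinates and the protein length reported in Table 46A can be cross-checked with simple arithmetic. The sketch below assumes that both positions are 1-based and that the stop position is the first base of the stop codon; the function name `orf_residue_count` is illustrative only.

```python
def orf_residue_count(orf_start, stop_codon_start):
    """Number of residues encoded between the start codon and the stop codon.

    Both arguments are 1-based nucleotide positions. The stop position is
    taken to be the first base of the stop codon (an assumption).
    """
    coding_length = stop_codon_start - orf_start
    if coding_length <= 0 or coding_length % 3 != 0:
        raise ValueError("start and stop are not in the same reading frame")
    return coding_length // 3

# NOV46a: ORF Start ATG at 20, ORF Stop TGA at 1268, protein listed as 416 aa.
print(orf_residue_count(20, 1268))  # 416
```

The result agrees with the 416 aa length given for SEQ ID NO: 188.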



Further analysis of the NOV46a protein yielded the following properties shown in
Table 46B.



Table 46B. Protein Sequence Properties NOV46a 


PSort analysis: 


0.9700 probability located in nucleus; 0.3000 probability located in 
microbody (peroxisome); 0.1000 probability located in mitochondrial matrix 
space; 0.1000 probability located in lysosome (lumen) 


SignalP analysis: 


No Known Signal Sequence Predicted 




A search of the NOV46a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 46C. 
Table 46C. Geneseq Results for NOV46a

Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV46a Residues/Match Residues | Identities/Similarities for the Matched Region | Expect Value
AAU78297 | Human Retinoid X Receptor beta (RXRbeta) protein - Homo sapiens, 533 aa. [WO200218420-A2, 07-MAR-2002] | 41..416 / 156..533 | 346/378 (91%) / 352/378 (92%) | 0.0
AAR72483 | Human H-2RIIBP - Homo sapiens, 533 aa. [US5403925-A, 04-APR-1995] | 41..416 / 156..533 | 346/378 (91%) / 352/378 (92%) | 0.0
AAR39468 | hRXR-beta1 - Homo sapiens, 533 aa. [WO9315216-A, 05-AUG-1993] | 41..416 / 156..533 | 346/378 (91%) / 352/378 (92%) | 0.0
AAR39469 | hRXR-beta2 - Homo sapiens, 510 aa. [WO9315216-A, 05-AUG-1993] | 41..416 / 133..510 | 345/378 (91%) / 351/378 (92%) | 0.0
AAY21625 | Ligand binding domain of nuclear receptor hRXRbeta - Homo sapiens, 525 aa. [WO9926966-A2, 03-JUN-1999] | 41..416 / 148..525 | 345/378 (91%) / 351/378 (92%) | 0.0



In a BLAST search of public sequence databases, the NOV46a protein was found to
have homology to the proteins shown in the BLASTP data in Table 46D.



Table 46D. Public BLASTP Results for NOV46a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV46a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


S37781 


retinoid X receptor beta - 
human, 533 aa. 


41..416 
156..533 


346/378 (91%) 
352/378 (92%) 


0.0 


Q95L53 


Retinoid X receptor beta - 
Mustela vison (American 
mink), 525 aa (fragment). 


41..416 
148..525 


346/378 (91%) 
352/378 (92%) 


0.0 
P28702


Retinoic acid receptor 
RXR-beta - Homo sapiens 
(Human), 533 aa. 


41..416 
156..533 


352/378 (92%) 


0.0


A41651 


retinoic acid receptor 
coregulator - rat, 451 aa.


41..416 
74..451 


341/378 (90%) 
349/378 (92%) 


0.0 


D41727 


retinoid X receptor beta - 
mouse, 448 aa. 


41..416 
71..448 


341/378 (90%) 
349/378 (92%) 


0.0 



PFam analysis predicts that the NOV46a protein contains the domains shown in the 
Table 46E. 




Table 46E. Domain Analysis of NOV46a

Pfam Domain | NOV46a Match Region | Identities/Similarities for the Matched Region | Expect Value
zf-C4 | 86..161 | 49/77 (64%) / 73/77 (95%) | 1.5e-54
hormone_rec | 227..409 | 74/207 (36%) / 157/207 (76%) | 3.3e-68



Example 47. 

The NOV47 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 47A.



Table 47A. NOV47 Sequence Analysis 




SEQ ID NO: 189 |1229 bp


NOV47a, 
CG173347-01 
DNA Sequence 


CCGAGACCATGGGGAAGCTCGTGGCGCTGGTCCTGCTGGGGGTCGGCCTGTCCTTAGTCGGGGAGAT 

GTTTCTGGCGTTTAGAGAAAGGGTGAATGCCTCTCGAGAAGTGGAGCCAGTAGAACCTGAAAACTGC 

CACCTTATTGAGGAACTTGAAAGTGGCTCTGAAGATATTGATATACTTCCTAGTGGGCTGGCTTTTA 

TCTCCAGTCTGCAGGTCTGTTGGAGTTTGCTGGAAGTCCACTCCAGACCCTGTTTGCCTGGGTATCA 

CCAGTGGAGGCTGCAGAACGGCAAATATTGCTGCCTGATTTTTCTTCTGGAAGCTTCATCCCAGAGG 

GGCATCCGCCTGTATGAGGGATTAAAATATCCAGGCATGCCAAACTTTGCGCCAGATGAACCAGGAA 

AAATCTTCTTGATGGATCTGAATGAACAAAACCCAAGGGCACAAGCACTAGAAATCAGTGGTGGATT 

TGACAAAGAATTATTTAATCCACATGGGATCAGTATTTTCATCGACAAAGACAATACTGTGTATCTT 

TATGTTGTGAATC ATCC C CACATG AAGTCCAC TGTGGAG ATATTTAAAT TTG AGGAAC AACAACGTT 

CTCTGGTATACCTGAAAACTATAAAACATGAACTTCTCAAAAGTGTGAATGACATTGTC 

ACC AG AACAGTTCT ATGCCACC AGAGAC C AC TATTTTACCAAC TCC C TCCTGTCATTTTTTGAGATG 

ATCTTGGATCTTCGCTGGACTTATGTTCTTTTCTACAGCCCAAGGGAGGTTAAAGTGGTGGCCAAAG 

GATTTTGTAGTGCCAATGGGATCACAGTCTCAGCAGACCAGAAGTATGTCTATGTAGCTGATGTAGC 

AGCTAAGAACATTCACATAATGGAAAAACATGATAACTGGGATTTAACTCAACTGAAGGTC 

TTGGGCACCTTAGTGGATAACCTGACTGTCGATCCTGCCACAGGAGACATTTTGGCAGGATGCCATC 

CTAATCCTATGAAGCTACTGAACTATAACCCTGAGGACCCTCCAGGATCAGAAGTACTTCGCATCCA 

GAATGTTTTGTCTGAGAAGCCCAGGGTGAGCACCGTGTATGCCAACAATGGCTCTGTGCTTCAGGGC 

ACCTCTGTGGCTTCTGTGTACCATGGGAAAATTCTC^TAGGCACCGTATTTCNCAAAACTCTC 
GTGAGCTCTAGACTCTAGATAGT

ORF Start: ATG at 9 | ORF Stop: TAG at 1215





SEQ ID NO: 190 |402 aa |MW at 45160.5kD 


NOV47a, 
CG173347-01 
Protein Sequence 


MGKLVALVLLGVGLSLVGEMFLAFRERVNASREVEPVEPENCHLIEELESGSEDIDILPSGIAFISS 
LQVCTSLLEVHSRPCLPGYHQWRLQNGKYCCLIFLLEASSQRGIRLYEGLKY^ 

V^TI^^SVNDIVVIX3PEQFYATRDHYFTNSLLSFFEMILDIJIWTWLFYSPREVKVVAKGFC 
SANGITVSAMKYVYVADVAAKNIHI^^ 

^^P^PGSF^RIQOTLSEKPRVSTVYAimGSVLQGTSVASVYHGKILIGTVFXKTLYCEL 
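
The relationship between the DNA and protein listings in Table 47A can be illustrated by translating the start of the ORF. This is a minimal sketch, assuming the standard genetic code (the text does not state which translation table was used) and using only the first 67-nucleotide line of the NOV47a DNA sequence; the names `CODON_TABLE` and `translate` are illustrative.

```python
from itertools import product

# Standard genetic code, built in TCAG order. Using this table is an
# assumption; the source does not name the translation table.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna, orf_start):
    """Translate complete codons from a 1-based start position until a stop codon."""
    protein = []
    for i in range(orf_start - 1, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# First line of the NOV47a DNA sequence in Table 47A (67 nucleotides).
NOV47A_FIRST_LINE = (
    "CCGAGACCATGGGGAAGCTCGTGGCGCTGGTCCTGCTGGGGGTCGGCCTGTCCTTAGTCGGGGAGAT"
)

print(translate(NOV47A_FIRST_LINE, 9))  # MGKLVALVLLGVGLSLVGE
```

The 19 residues obtained match the beginning of the NOV47a protein sequence, which is consistent with an ORF starting at the ATG at position 9.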



Further analysis of the NOV47a protein yielded the following properties shown in 
Table 47B. 



Table 47B. Protein Sequence Properties NOV47a 


PSort analysis: 


0.8200 probability located in outside; 0.1900 probability located in lysosome 
(lumen); 0.1000 probability located in endoplasmic reticulum (membrane); 
0.1000 probability located in endoplasmic reticulum (lumen) 


SignalP analysis: 


Cleavage site between residues 31 and 32 
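
The SignalP result in Table 47B implies a split of the NOV47a polypeptide into a signal peptide and a remaining chain. The sketch below assumes 1-based residue numbering and uses only the first 40 residues of the protein as listed in Table 47A; `split_at_cleavage` is a hypothetical helper, and the resulting lengths are predictions that follow from the table, not measured values.

```python
# N-terminal residues of NOV47a as listed in Table 47A (first 40 of 402).
NOV47A_N_TERMINUS = "MGKLVALVLLGVGLSLVGEMFLAFRERVNASREVEPVEPE"
FULL_LENGTH = 402      # residues, from Table 47A
CLEAVAGE_AFTER = 31    # SignalP: cleavage between residues 31 and 32

def split_at_cleavage(sequence, cleavage_after):
    """Split a protein into (signal peptide, remainder) at a 1-based site."""
    return sequence[:cleavage_after], sequence[cleavage_after:]

signal_peptide, mature_start = split_at_cleavage(NOV47A_N_TERMINUS, CLEAVAGE_AFTER)
print(signal_peptide)                # predicted signal peptide, 31 residues
print(mature_start)                  # first residues after the predicted cleavage site
print(FULL_LENGTH - CLEAVAGE_AFTER)  # predicted length of the remaining chain
```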



A search of the NOV47a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 47C. 



Table 47C. Geneseq Results for NOV47a

Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV47a Residues/Match Residues | Identities/Similarities for the Matched Region | Expect Value
ABB97287 | Novel human protein SEQ ID NO: 555 - Homo sapiens, 354 aa. [WO200222660-A2, 21-MAR-2002] | 1..402 / 1..354 | 352/402 (87%) / 352/402 (87%) | 0.0
AAG75494 | Human colon cancer antigen protein SEQ ID NO:6258 - Homo sapiens, 370 aa. [WO200122920-A2, 05-APR-2001] | 2..402 / 18..370 | 352/401 (87%) / 352/401 (87%) | 0.0
ABG08350 | Novel human diagnostic protein #8341 - Homo sapiens, 382 aa. [WO200175067-A2, 11-OCT-2001] | 1..402 / 24..382 | 330/407 (81%) / 333/407 (81%) | e-178
AAU11925 | Protein sequence of rabbit paraoxonase-3 (PON3) mutant D324N - Oryctolagus cuniculus, 355 aa. [WO200190336-A2, 29-NOV-2001] | 1..402 / 1..355 | 294/403 (72%) / 318/403 (77%) | e-164
AAU11922 | Protein sequence of rabbit paraoxonase-3 (PON3) mutant N169D - Oryctolagus cuniculus, 355 aa. [WO200190336-A2, 29-NOV-2001] | 1..402 / 1..355 | 294/403 (72%) / 318/403 (77%) | e-164



In a BLAST search of public sequence databases, the NOV47a protein was found to
have homology to the proteins shown in the BLASTP data in Table 47D. 
Table 47D. Public BLASTP Results for NOV47a

Protein Accession Number | Protein/Organism/Length | NOV47a Residues/Match Residues | Identities/Similarities for the Matched Portion | Expect Value
Q15166 | Serum paraoxonase/arylesterase 3 (EC 3.1.1.2) (EC 3.1.8.1) (PON 3) (Serum aryldiakylphosphatase 3) (A-esterase 3) (Aromatic esterase 3) - Homo sapiens (Human), 354 aa. | 1..402 / 1..354 | 354/402 (88%) / 354/402 (88%) | 0.0
Q9BZH9 | Paraoxanase-3 - Homo sapiens (Human), 354 aa (fragment). | 1..402 / 1..354 | 351/402 (87%) / 351/402 (87%) | 0.0
Q9BGN0 | Paraoxonase 3 - Oryctolagus cuniculus (Rabbit), 354 aa. | 1..402 / 1..354 | 293/402 (72%) / 318/402 (78%) | e-164
Q62087 | Serum paraoxonase/arylesterase 3 (EC 3.1.1.2) (EC 3.1.8.1) (PON 3) (Serum aryldiakylphosphatase 3) (A-esterase 3) (Aromatic esterase 3) - Mus musculus (Mouse), 354 aa. | 1..402 / 1..354 | 283/402 (70%) / 314/402 (77%) | e-158
Q90952 | Serum paraoxonase/arylesterase 2 (EC 3.1.1.2) (EC 3.1.8.1) (PON 2) (Serum aryldiakylphosphatase 2) (A-esterase 2) (Aromatic esterase 2) - Gallus gallus (Chicken), 354 aa. | 1..402 / 1..354 | 230/402 (57%) / 287/402 (71%) | e-131



PFam analysis predicts that the NOV47a protein contains the domains shown in the 
Table 47E. 
Table 47E. Domain Analysis of NOV47a

Pfam Domain | NOV47a Match Region | Identities/Similarities for the Matched Region | Expect Value
Arylesterase | 2..402 | 230/422 (55%) / 348/422 (82%) | 1.2e-190



Example 48. 

The NOV48 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 48A.



Table 48A. NOV48 Sequence Analysis 




SEQ ID NO: 191 |2109 bp


NOV48a, 
CG56234-01 
DNA Sequence 


CCTTCCATACCTCCCCGGCTCCGCTCGGTTCCTGGCCACCCCGCAGCCCCTGCCCAGGTGCCATGGC 


CGCATTGTACCGCCCTGGCCTGCGGCTTAACTGGCATGGGCTGAGCCCCTTGGGCTGGCCATCATGC 
CGTAGC ATCC AGACC CTGCGAGTGCTTAGTGGAGATCTGGGCCAGCTTCCCAC TGGC ATTCGAGATT 
TTGTAGAGCACAGTGCCCGCCTGTGCCAACCAGAGGGCATCCACATCTGTGATGGAACTGAGGCTGA 
GAATACTGCCACACTGACCCTGCTGGAGCAGCAGGGCCTCATCCGAAAGCTCCCCAAGTACAATAAC 
TGCTGGCTGGCCCGCACAGACCCCAAGGATGTGGCACGAGTAGAGAGCAAGACGGTGATTGTAACTC 
CTTCTCAGCGGGACACGGTACCACTCCCGCCTGGTGGGGCCCGTGGGCAGCTGGGCAACTGGATGTC 
CCCAGCTGATTTCCAGCGAGCTGTGGATGAGAGGTTTCCAGGCTGCATGCAGGGCCGCACCATGTAT 
GTGCTTCCATTCAGCATGGGTCCTGTGGGCTCCCCGCTGTCCCGCATCGGGGTGCAGCTCACTGACT 
CAGCCTATGTGGTGGCAAGCATGCGTATTATGACCCGACTGGGGACACCTGTGCTTCAGGCCCTGGG 
AGATGGTGACTTTGTCAAGTGTCTGCACTCCGTGGGCCAGCCCCTGACAGGACAAGGGGAGCCAGTG 
AGCCAGTGGCCGTGCAACCCAGAGAAAACCCTGATT^

CCTTCGGCAGCGGCTATGGTGGCAACTCCCTGCTGGGCAAGAAGTGCTTTGCCCTACGCATCGCCTC 
TCGGC TGGCCCGGGATGAGGGCTGGCTGGC AGAGC AC ATGC TGATCCTGGGC ATC ACC AGCCCTGCA 
GGGAAGAAGCGCTATGTGGCAGCCGCCTTCCCTAGTGCCTGTGGCAAGACCAACCTGGCTATGATGC 
GGCCTGCACTGCCAGGCTGGAAAGTGGAGTGTGTGGGGGATGATATTGCTTGGATGAGGTTTGACAG 
TGAAGGTCGACTCCGGGCCATCAACCCTGAGAACGGCTTCTTTGGGGTTGCCCCTGGTACCTCTGCC 
ACCACCAATCCCAACGCCATGGCTACAATCCAGAGTAACACTATTTTTACCAATGTGGCTGAGACCA 
GTGATGGTGGCGTGTACTGGGAGGGCATTGACCAGCCTCTTCCACCTGGTGTTACTGTGACCTCCTG 
GC TGGGCAAACCCTGGAAATC TGGTGAC AAGGAGC CC TGTGCAC ATCCCAAC TCTCGATTTTGTGCC 
CCGGCTCGCCAGTGCCCCATCATGGACCCAGCCTGGGAGGCCCCAGAGGGTGTCCCCATTGACGCCA 
TC ATCTTTGGTGGCCGC AGACCCAAAGGGGTACCCC TGGTATACGAGGCCTTCAAC TGGCGTCATGG 
GGTGTTTGTGGGC AGCGCC ATGCGCTCTGAGTCCAC TGC TGCAGC AGAAC ACAAAGGGAAGATCATC 
ATGC ACGACC CATTTGCCATGCGGCCC TTTTTTGGC TAC AACTTCGGGCAC TACCTGGAACACTGGC 
TGAGCATGGAAGGGCGCAAGGGGGCCCAGCTGCCCCGTATCTTCCATGTCAACTGGTTCCGGCGTGA 
CGAGGCAGGGCACTTCCTGTGGCCAGGCTTTGGGGAGAATGCTCGGGTGCTAGACTGGATCTGCCGG 
CGGTTAGAGGGGGAGGACAGTGCCCGAGAGACACCCATTGGGCTGGTACCAAAGGAAGGAGCCTTGG 
ATCTCAGCGGCCTCAGAGCTATAGACACCACTCAGCTGTTCTCCCTCCCCAAGGACTTCTGGGAACA 
GGAGGTTCGTGACATTCGGAGCTACCTGACAGAGCAGGTCAACCAGGATCTGCCCAAAGAGGTGTTG 
GCTGAGCTTGAGGCCCTGGAGAGACGTGTGCACAAAATGTGACCTGAGGCCCTAGTCTAGCAAGAGG 
ACATAGC AC C C TC ATC TGGG AAT AGGG AAGGC AC CTTGC AG AAAAT ATG AGC AATTTG AT ATTAAC T 


AACATCTTCAATGTGCCATAGACCTTCCCACA 




ORF Start: ATG at 63 | ORF Stop: TGA at 1983





SEQ ID NO: 192 |640 aa |MW at 70688.2kD


NOV48a, 
CG56234-01 
Protein Sequence 


MAALYRPGLRLNWHGLSPLGWPSCRSIQTLRVL^^ 

AENTATLTLLEQQGL IRKL PKYNNC WLARTDPKDVAR VESKTVI VT P S QRDTVPLPPGGARGQLGNW 
MS PADFQRAVDERF PGCMQGRTMYVL PFSMGPVGS PLSR IGVQLTDS AYWASMRIMTRLGTPVLQA 
LGDGDFVKCIiHS VGQPLTGQGEPVSQWPCNPEKTL I GHV PDQRE 1 1 SFGSGYGGNSLLGKKCFALR I 
ASRLARDEGWL AEHML ILG ITS PAGKKRYVAAAFPSACGKTNIAMMRPAI* PGWKVECVGDDI AWMRF 
DSEGRLRAINPENGFFGVAPGTSATTNPNAMATIQSNTIFTNVAETSDGGVYWEGIDQPLPPGVTVT 
SWLGKPWKSGDKEPCAHPNSRFCAPARQC PIMDPAWEAPEGVP IDAI I FGGRRPKGVPLVYEAFNWR 
HGVFVGSAMRSESTAAAEHKGKIIMHDPFAMRPFFGYOTGHYLEHWLSMEGRKGAQLPRIFHVNWFR 
RDEAGHFLWPGFGENARVLDWICRRLEGEDSARETPIGLVPKEGALDLSGLRAIDTTQLFSIiPKDFW 
EQEVRDI RS YLTEQVNQDL PKEVXiAELEALERRVHKM 








SEQ ID NO: 193 |2069 bp


NOV48b, 
CG56234-02 
DNA Sequence 


CCCGCCTTCCATACCTCCCCGGCTCCGCTCGGTTCCTGGCCACCCCGCAGCCCCTGCCCAGGTGCCA 


TGGCCGCATTGTACCGCCCTGGCCTGCGGCTTAACTGGCATGGGCTGAGCCCCTTGGGCTGGCCATC 
ATGCCGTAGCATCCAGACCCTGCGAGTGCTTAGTGGAGATCTGGGCCAGCTTCCCACTGGCATTCGA 
GATTTTGTAGAGCACAGTGCCCGCCTGTGCCAACCAGAGGGCATCC^CATCTGTGATGGAACTGAGG 
CTGAGAATACTGCCACACTGACCCTGCTGGAGCAGCAGGGCCTCATCCGAAAGCTCCCCAAGTACAA 
TAACTGCTGGCTGGCCCGCACAGACCCCAAGGATGTGGCACGAGTAGAGAGCAAGACGGTGATTGTA 
ACTCCTTCTCAGCGGGACACGGTACCACTCCCGCCTGGTGGGGCCTGTGGGCAGCTGGGCAACTGGA 
TGTCCCCAGCTGATTTCCAGCGAGCTGTGGATGAGAGGTTTCCAGGCTGCATGCAGGGCCGCACCAT 
GTATGTGCTTCCATTCAGCATGGGTCCTGTGGGCTCCCCGCTGTCCCGCATCGGGGTGCAGCTCACT 
GACTCAGCCTATGTGGTGGCAAGCATGCGTATTATGACCCGACTGGGGACACCTGTGCTTCAGGCCC 
TGGGAGATGGTGACTTTGTCAAGTGTCTGCACTCCGTGGGCCAGCCCCTGACAGGACAAGGGGAGCC 
AGTGAGCCAGTGGCCGTGCAACCCAGAGAAAACCCTGATTGGCCACGTGCCCGACCAGCGGGAGATC 
ATCTCCTTCGGCAGCGGCTATGGTGGCAACTCCCTGCTGGGCAAGAAGTGCTTTGCCCTACGCATCG 
CCTCTCGGCTGGCCCGGGATGAGGGCTGGCTGGCAGAGCACATGCTGATCCTGGGCATCACCAGCCC 
TGCAGGGAAGAAGGCGCTATGTGCAGCCGCCTTCCCTAGTGCCTGTGGCAAGACCAACCTGGCTATG 
ATGCGGCCTGCACTGCCAGGCTGGAAAGTGGAGTGTGTGGGGGATGATATTGCTTGGATGAGGTTTG 
ACAGTGAAGGTCGACTCCGGGCCATCAACCCTGAGAACGGCTTCTTTGGGGTTGCCCCTGGTACCTC 
TGCCACC ACCAATCCCAACGCCATGGC TACAATCCAGAGTAAC ACTAT TTTTACCAATGTGGCTGAG 
ACC^GTGATGGTGGCGTGTACTGGG^GGGCATTGACCAGCCTCTTCCACCTGGTGTTACTGTGACCT 
CCTGGCTGGGCAAACCCTGGAAACCTGGTGACAAGGAGCCCTGTGCACATCCCAACTCTCGATTTTG 
TGCCCCGGCTCGCCAGTGCCCCATCATGGACCCAGCCTGGGAGGCCCCAGAGGGTGTCCCCATTGAC 
GCCATCATCTTTGGTGGCCGCAGACCCAAAGGGAAGATCATCATGCACGACCCATTTGCCATGCGGC 
CCTTTTTTGGC TAC AACTTCGGGCACTACCTGGAACACTGGC TG AGCATGGAAGGGCGCAAGGGGGC 
CCAGCTGCCCCGTATCTTCCATGTCAACTGGTTCCGGCGTGACGAGGCAGGGCACTTCCTGTGGCCA 
GGCTT TGGGGAGAATGC TCGG GTG CTAGAC TGG ATC TGCCGG C GGT T AG AGGGGG AGGAC AGTGCC C 




GAGAGACACCCATTGGGCTGGTGCCAAAGGAAGGAgIM<F(^^ 

C ACCACTC AGCTGTTCTCCCTCC CC AAGGACTTC TGGGAAC AGGAGGTTCGTGACATTCGG AGC TAC 
CTGACAGAGCAGGTCAACCAGGATCTGCCCAAAGAGGTGTTGGCTGAGCTTGAGGCCCTGGAGAGAC 
GTGTGCACAAAATGTGACCTGAGGCCTAGTCTAGCAAGAGGACATAGCACCCTCATCTGGGAATAGG 
GAAGGCACCTTGCAGAAAATATGAGCAATTGATATTAACTAACATCTTCAATGTGCCATAGACCTTC 




CCACAAAGACTGTCCAATAATAAGAGATGCTTATCTATTTTAAAAAAAAAAA^pVA^AAA, 




ORF Start: ATG at 67 | ORF Stop: TGA at 1891





SEQ ID NO: 194 |608 aa |MW at 67027.1kD


NOV48b, 
CG56234-02 
Protein Sequence 


MAALYRPGLRLNVmGLSPLGWPSCRSIQTLRVLSGDLGQLPTGIRDFVEHSARLCQPEGIHICDGTE 
AENTATLTLLEQQGLIRKLPKYNNCTOARTDPKDVARVESKTVIVTPSQIU>TVPLPPGGACGQIX3NW 
MS P ADFQRAVDERFPGCMQGRTMYVLPFSMG PVGS PL SR IGVQLTD S AYWASMRIMTRLGTPVLQA 
LGIX3DFVKCLHSVGQPLTGQGEPVSQWPCNPEKTLIGHVPDQREI I SFGSGYGGNSLLGKKCFALRI 
ASRL ARDEGWLAEHML I LGI TS PAGKKALCAAAF PS ACGKTNLAMMR PAL PGWKVECVGDDI AWMRF 
DSEGRLRAINPENGFFGVAPGTSATTNPNAMATIQSNTIFTNVAETSDGGVYWEGIDQPLPPGVTVT 
SWLGKPWKPGDKEPCAHPNSRFCAPARQCPIMDPAWEAPEGVPIDAIIFGGRRPKGKIIMHDPFAMR 
PFFGYNFGHYLEHWLSMEGRKGAQLPRIFHVNWFRRDEAGHFLWPGFGENARVLDWICRRLEGEDSA 
RETPIGLVPKEGALDLSGLRAIDTTQLFSLPKDFWEQEVRDIRSYLTEQVNQDLPKEVLAELEALER 
RVHKM 



Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 48B. 
Table 48B. Comparison of NOV48a against NOV48b.

Protein Sequence | NOV48a Residues/Match Residues | Identities/Similarities for the Matched Region
NOV48b | 1..640 / 1..608 | 577/640 (90%) / 577/640 (90%)
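
The figures in Table 48B can be decomposed into gaps and mismatches. The following is a rough sketch, assuming that the denominator of "577/640" counts alignment columns including gaps, and that residue ranges are 1-based and inclusive; the helper `span_length` is illustrative.

```python
def span_length(residue_range):
    """Length of a 1-based inclusive range written as 'start..end'."""
    start, end = (int(x) for x in residue_range.split(".."))
    return end - start + 1

# Values from Table 48B.
query_len = span_length("1..640")   # NOV48a residues
match_len = span_length("1..608")   # NOV48b residues
identities, columns = 577, 640      # from "577/640 (90%)"

# Assumption: the denominator counts alignment columns, gaps included.
gaps_in_match = columns - match_len
gaps_in_query = columns - query_len
aligned_pairs = columns - gaps_in_match - gaps_in_query
mismatches = aligned_pairs - identities

print(gaps_in_query, gaps_in_match, mismatches)
```

Under these assumptions, most of the difference between the two variants comes from positions that are present in NOV48a but absent from NOV48b, rather than from substituted residues.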



Further analysis of the NOV48a protein yielded the following properties shown in 
Table 48C.



Table 48C. Protein Sequence Properties NOV48a 


PSort analysis: 


0.6402 probability located in microbody (peroxisome); 0.3000 probability 
located in nucleus; 0.2412 probability located in lysosome (lumen); 0.1000 
probability located in mitochondrial matrix space 


SignalP analysis: 


No Known Signal Sequence Predicted 
A search of the NOV48a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 48D.
Table 48D. Geneseq Results for NOV48a

Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV48a Residues/Match Residues | Identities/Similarities for the Matched Region | Expect Value
AAY80296 | Human mitochondrial phosphoenolpyruvate carboxykinase SEQ ID NO:1 - Homo sapiens, 640 aa. [US6030837-A, 29-FEB-2000] | 1..640 / 1..640 | 634/640 (99%) / 634/640 (99%) | 0.0
AAB71890 | Mouse PEPCK-cytosolic protein - Mus musculus, 622 aa. [US6187545-B1, 13-FEB-2001] | 31..640 / 14..622 | 440/610 (72%) / 519/610 (84%) | 0.0
AAB71880 | Human PEPCK-cytosolic protein - Homo sapiens, 622 aa. [US6187545-B1, 13-FEB-2001] | 31..640 / 14..622 | 438/610 (71%) / 517/610 (83%) | 0.0
ABB65318 | Drosophila melanogaster polypeptide SEQ ID NO 22746 - Drosophila melanogaster, 647 aa. [WO200171042-A2, 27-SEP-2001] | 27..640 / 35..647 | 394/616 (63%) / 480/616 (76%) | 0.0
ABB65322 | Drosophila melanogaster polypeptide SEQ ID NO 22758 - Drosophila melanogaster, 638 aa. [WO200171042-A2, 27-SEP-2001] | 30..640 / 29..638 | 402/613 (65%) / 469/613 (75%) | 0.0



In a BLAST search of public sequence databases, the NOV48a protein was found to
have homology to the proteins shown in the BLASTP data in Table 48E. 
Table 48E. Public BLASTP Results for NOV48a

Protein Accession Number | Protein/Organism/Length | NOV48a Residues/Match Residues | Identities/Similarities for the Matched Portion | Expect Value
Q16822 | Phosphoenolpyruvate carboxykinase, mitochondrial precursor [GTP] (EC 4.1.1.32) (Phosphoenolpyruvate carboxylase) (PEPCK-M) - Homo sapiens (Human), 640 aa. | 1..640 / 1..640 | 635/640 (99%) / 635/640 (99%) | 0.0
S69546 | phosphoenolpyruvate carboxykinase (GTP) (EC 4.1.1.32) precursor, mitochondrial - human, 640 aa. | 1..640 / 1..640 | 634/640 (99%) / 634/640 (99%) | 0.0
Q91Z10 | Similar to phosphoenolpyruvate carboxykinase 2 (mitochondrial) - Mus musculus (Mouse), 640 aa. | 1..640 / 1..640 | 590/640 (92%) / 609/640 (94%) | 0.0
Q8R3X7 | Similar to RIKEN cDNA 9130022B02 gene - Mus musculus (Mouse), 535 aa (fragment). | 106..640 / 1..535 | 504/535 (94%) / 518/535 (96%) | 0.0
P07379 | Phosphoenolpyruvate carboxykinase, cytosolic [GTP] (EC 4.1.1.32) (Phosphoenolpyruvate carboxylase) (PEPCK-C) - Rattus norvegicus (Rat), 622 aa. | 31..640 / 14..622 | 441/610 (72%) / 520/610 (84%) | 0.0



PFam analysis predicts that the NOV48a protein contains the domains shown in the 
Table 48F. 
Table 48F. Domain Analysis of NOV48a

Pfam Domain | NOV48a Match Region | Identities/Similarities for the Matched Region | Expect Value
PEPCK | 46..640 | 445/608 (73%) / 591/608 (97%) | 0
Example 49.

The NOV49 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 49A. 



Table 49A. NOV49 Sequence Analysis 




SEQ ID NO: 195 |1202 bp




NOV49a, 
CG56836-01
DNA Sequence 


TGT AAGCGATCTGGTTCCCACC TC AGC CTCCCG AGTAGTGTC TTC AGGCC TATGGAGAGC AGCTTGC 




aCATGTGGCAGCTCTGGGCCTCCCTCTGCTGCCTGCTGGTGTTGGCCAATGCCCGGAGCAGGCCCTC 
TTTCC ATCCCC TGTCGGATG AGC TGGTC AACTATGTC AAC AAACGG AATACC ACGTGGCAGGCCGGG 
CACAACTTCTACAACGTGGACATGAGCTACTTGAAGAGGCTATGTGGTACCTTCCTGGGTGGGCCCA 
AGCCAC CCCAGAGAGTTATGTTTACCG AGGACCTGAAGCTGCCTGC AAGC TTCGATGCACGGGAACA 
ATGGCCACAGTGTCCCACCATCAAAGAGATCAGAGACCAGGGCTCCTGTGGCTCCTGCTGGGCCTTC 
GGGGC TGTGGAAGCCATCTC TGACCGG ATCTGC ATCC ACACC AATGCGCACGTCAGCGTGGAGGTGT 
CGGCGGAGGACCTGCTCACATGCTCTGGCAGCATGTGTGGGGACGGCTGTAATGGTGX3CTATCCTGC 
TGAAGCTTGGAACTTCTGGACAAGAAAAGGCCTGGTTTCTGGTGGCCTCTATGAATCCCATGTAGGG 
TGCAGACCGTACTCCATCCCTCCCTGTGAGCACCACGTCAACGGCTCCCGGCCCCCATGCACGGGGG 
AGGGAGATACCCCCAAGTGTAGCAAGATCTGTGAGCCTGGCTACAGCCCGACCTACAAACAGGACAA 
GCACTACGGATACAATTCCTACAGCGTCTCCAATAGCGAGAAGGACATCATGGCCGAGATCTACAAA 




AACACGTCACCGGAGAGATGATGGGTGGCCATGCCATCCGCATCCTGGGCTGGGGAGTGGAGAATGG 
CAC AC CC TACTGGCTGGTTGCCAACTC CTGGAAC ACTGACTGGGGTG AC AATGGCTTCTTTAAAATA 
Cl^AGAGGACAGGATCACTGTGGAATCGAATCAGAAGTGGTGGCTGGAATTCCACGCACCGATCAGT 
ACTGGGAAAAGATCTAATCTGCCGTGGGCCTGTCGTGCCAGTCCTGGGGGCGAGATCGGGGTA 




ORF Start: ATG at 137 


ORF Stop: TAA at 1154 





SEQ ID NO: 196 |339 aa |MW at 37821.3kD 


NOV49a, 
CG56836-01 
Protein Sequence 


MWQLWASLCCLLVLANARSRPSFHPLSDELV^^ 

PPQRVMFTEDLKLPAS FDAREQWPQC PTIKEIRDQGSCGSC WAFGAVEAI SDRI C IHTNAHVSVEVS 
AEDLLTCCGSMCGDGCNGGYPAEAWNFVJTRKGLVSGGLYESHVGCRPYSIPPCEHHVNGSRPPCTGE 
GDTPKC SKICEPGYSPTYKQDKHYG YNS YS VSNS EKDIMAB I YKNG PVBGAFS VYSDFLL YKSGVYQ 
HVTGEIMGGHAIRI LGWGVENGTP YWLVANS WNTDWGDNGFFK ILRGQDHCG I ESEVVAGI PRTDQY 
WEKI 



SEQ ID NO: 197 |723 bp



NOV49b, 
CG56836-02 
DNA Sequence 



ACATGGTGGATCTAGGATCCGGCTTCCAACA TGTGGCAGCTCTGGGCCTCCCTCTGCTGCCTGCTGG 



TGTTGGCCAATGCCCGGAGCAGGCCCTCTTTCCATCCCCTGTCGGATGAGCTGGTCAACTATGTCAA 
CAAAC GG AAT AC C ACG TGGC AGG C CGGGC AC AAC TTC T AC AACGTG G AC ATG AGCTAC T TG AAGAGG 
CTATGTGGTACCTTCC TGGGTGGGCCC AAGCCACCCC AGAG AGTTATGTTTAC CGAGGACCTGAAGC 
TGCCT^AAGCTTCGATG^ACGGGAACAATGGCCACAGTGTCCCACCATCAAAGAGATCAGAGACCA 
GGGCTCCTGTGGCTCCTGCTGGGCCTTCGGGGCTGTGGA 

accaatgcgcacgtcagcgtggaggtgtcggcggaggao:tgctcacctgcctgctctacaagtcag 
gagtgtaccaacacgtcaccggagagatcatgggtggccatgccatccgcatcctgggctggggagt 
ggagaatggcacaccctactggctggttcccaactcctggaacactgactggggtgacaatggcttc 
tttaaaatactcagaggac^ggat{^ctgtggaatcgaatcagaagtggtggctggaattccacgca 

CCG AT C AGTAC TGGGAAAAG ATCTAATC TG C CGTG GGC CTGTCGTG CC AAAC C 

ORF Start: ATG at 31 | ORF Stop: TAA at 694
SEQ ID NO: 198 |221 aa |MW at 24974.2kD


NOV49b, 
CG56836-02 
Protein Sequence 


MWQLV^SLCCLLVLANARSRPSFHPLSDELVNYVNKROT 

PPQRVMPTEDLKLPASPDAREQWPQCPTIKEIRIX^SCGSCWAPGAVEAISDRICIHTNAHVSV^ 
AEDLLTCLLYKSGVYQHVTGE^GGHAIRIIX3WGVENGTPYWLVANSWm i EWGDNGFFKILRGQDHC 
G I ES EWAGI PRTDQYWEK I 






SEQ ID NO: 199 |1028 bp


NOV49c, 
CG56836-03 
DNA Sequence 


TGTAAGCGATCTGGTTCCCACCTCAGCCTCCCGAGTAGTGTCTTCAGGCCTATGGAGAGGAGCTTGC 


GTGGGCTGGGC C TGC AGT ACC TGGTT TGCATAGATG ATTGGCAGGTGGATCTAGGATCCGG C TTCC A 


ACATGTGGCAGCTCTGGGCCTCCCTCTGCTGCCTGCTGGTGTTGGCCAATGCCCGGAGCAGGCCCTC 
TTTCCATCCCCTGTCGGATGAGCTGGTCAACTATGTCAACAAACGGAATACCACGTGGCAGGCCGGG 
C AC AACTTC TACAACGTGGAC ATGAGCTACTTGAAGAGGC TATGTGGTACCTTCCTGGGTGGGCCC A 
AGCC ACCCC AG AGAGT TATGTTT ACCG AGGACC TGAAG C TGCC TGC AAGCTTCGATGC ACGGGAAC A 
ATGGCCACAGTGTCCCACCATCAAAGAGATCAGAGACCAGGGCTCCTGTGGCTCCTGCTGGGCCTTC 
GGGGCTGTGGAAGK!CATCTCTGACCGGATCTGCATCCACACCAATGCGCACGTCAGCGTGGAGGTGT 
CGGCGG AGGACCTGC TC AC ATGCTGTGGCAGC ATGTGTGGGGACGGCTGTAATGGTGGC TATCCTGC 
TGAAGC TTGG AACTTC TG G AC AAGAAAAGGCC TGG T TTCTGGTGG C CTC T ATGAATC C AAT AGCG AG 
AAGGACATCATGGCCGAGATCTACAAAAACGGCCCCGTGGAGGGAGCTTTCTCTGTGTATTCGGACT 
TC C TGC TCTACAAGTCAGGAGTGTACC AAC ACGTC ACCGGAGAGATGATGGGTGGC CATGCCATCCG 
CATCCTGGGCTGGGGAGTGGAGAATGGCACACCCTACTGGCTGGTTGCCAACTCCTGG^CACTGAC 
TGGGGTGAC^TGGCTTCTTTAAAATACTCAGAGGACAGGATCACTGTGGAATCGAATCAGAAGTGG 
TGGCTGGAATTCCACGCACCGATCAGTACTGGGAAAAGATCTAATCTGCCGTGGGCCTGTCGTGCCA 
GTCCTGGGGGCGAGATCGGGGTA 




ORF Start: ATG at 137 | ORF Stop: TAA at 980






SEQ ID NO: 200 |281 aa |MW at 31423.2kD


NOV49c, 
CG56836-03 
Protein Sequence 


MWQLWASIXCLLVLANARSRPSFHP^ 

PPQRVMFTEDLKLPASFDAREQWPQCPTIKEIRDQGSCGSCWAFGAVEAISDRICIHTNAHVSVEVS 

AEDLLTCCXSSMCGDGCNGGYPAEAWNFWTRKGLVSGGLYESNSEKDIMAEIYKNGPVEG^ 

LLYXSGWQHWGEMMGGHAIRILGWSVFJJGTPY^ 

AGIPRTDQYWEKI 






SEQ ID NO: 201 |1028 bp


NOV49d, 
CG56836-04 
DNA Sequence 


TGTAAGCGATCTGGTTCCCACCTCAGCCTCCCGAGTAGTGTCTTCAGGGCTATGGAGAGC^GCTTGC 


GTGGGCTGGX3CCTGCAGTACCTGGTTTGCATAGATGATTGGCAGGTGGATCTAGGATCCGGGTTCCA 


ACATGTGGCAGCTCTGGGCCTCCCTCTGCTGCCTGCTGGTGTTGGCCAATGCCCGGAGCAGGCCCTC 
TTTCCATCCCC TGTCGGATGAGCTGGTCAACTATGTCAAC AGAC GGAATACCACGTGGCAGGCCGGG 
CACAACTTCTACAACGTGGACATGAGCTACTTGAAGAG^CTATGTGGTACCTTCCTGGGTGGGCCCA 
AGCC AC CCC AGAGAGTTATGTTTACCGAGGACCTGAAGCTGCC TGC AAGCTTCGATGC ACGGGAACA 
ATGGCC AC AGTGTCCC ACC ATC AAAGAGATC AGAGACCAGGGC TCCTGTGGC TCCTGCTGGGTTTC T 
GGTGGCCTCTATGAATCCCATGTAGGGTGCAGACCGTACTCCATCCCTCCCTGTGAGCACCACGTCA 
ACGGCTCO:GGCCCCCATGC^CGGGGGAGGGAGATACCCCCAAGTGTAGCAAGATCTGTGAGCCTGG 
CTACAGCCCGACCTACAAACAGGACAAGCACTACGGATACAATTCCTACAGCGTCTCCAATAGCGAG 
AAGGACATCATGGCCGAGATCTACAAAAACGGCCCCGTGGAGGGAGCTTTCTCTGTGTATTCGGACT 
TCCTGCTCTACAAGTCAGGAGTGTACCAACACGTCACCGGAGAGATGATGGGTGGCCATGCCATCCG 
CATCCTGGGCTG<3GGAGTGGAGAATGGCACACCCTACTGGCTGGTTGCCAACTCCTGGAACACTGAC 
TGGGGTGACAATGGCTTCTTTAAAATACTCAGAGGACAGGATCACTG^ 

IK^CTGGAATTCCACGCACCGATCAGTACTGGGAAAAGATCTAATCTGCCGTGGGCCTGTCGTQCCA 
GTCCTGGGGGCGAGATCGGGGTA 
ORF Start: ATG at 137





SEQ ID NO: 202 |281 aa |MW at 31732.5kD


NOV49d, 
CG56836-04 
Protein Sequence 


MWQLWASLCCLLVLANARSRFSFHPLSDELVNYVN^^ 

PPQRVMPTEDLKLPASFDAREQWPQCPTIKEIRDO^SOSSCWSGGLYESHVGCRPYSIPPCEHHVN 
GSRPPCTGEGDTPKCSKICEPGYSPTYKQDKKYGYNSYSVSNSEKDIMAEIYKNGPVEGAFSVYSDF 
LLYKSGVYQ.HVTGEMMGGHAIR I LG WGVENGT P YWLVANS WNTDWGDNGFFK ILRGQDHCG I ESEW 
AG I PRTDQYV7EKI 





SEQ ID NO: 203 |340 bp


NOV49e, 
247856403 DNA 
Sequence 


AGGCTCCGCGGCCGCXCCCTTCACCGGATCCCTGCCTGCAAGCTTCX3ATGCACGGGAACAATGGCCA 

CAGTGTCCH^CCATCJIAAGAGATCAGAGACCAGGGCTCCTGTGGCTCCTGCTGGGCCTTCGGGGCTG 

TGGAAGC CATCTC TGACCGGATCTGCATCCACACCAATGCGCACGTCAGCGTGGAGGTGTCX5GCGGA 

GGACCTGCTCACATGCTGTGGCAGCATGTGTGGGGACGGCTGTAATGGTGGCTATCCTGCTGAAGCT 

TGGAACTTCTGGACAAGAAAAGGCCTGGTTTCTGGTGGCCTCTATCTC^ 

CCGAC. 




ORF Start: at 2 


ORF Stop: end of sequence 





SEQ ID NO: 204 |113 aa |MW at 11834.0kD


NOV49e, 
247856403 
Protein Sequence 


GSAAAPFTGSLPASFDAREQWPQCPTIKEIRDQGSCGSCWAFGAVEAISDRICIHTNAHVSVEVSAE 
DLLTCCGSMCGDGCNGGYPAEAWNFWTRKGLVSGGLYLEGKGGRAD 



SEQ ID NO: 205 |376 bp



NOV49f,
247856434 DNA
Sequence



AGGCTCCGCGGCCGCCCCCTTCACCGGATCCTCCAATAGCGAGAAGGACATCATGGCCGAGATCTAC
AAAAACGGCCCCGTGGAGGGAGCTTTCTCTGTGTATTCGGACTTCCTGCTCTACAAGTCAGGAGTGT
ACC ^ CACG TCACCGGAGAGATGATGGGTGGCCATGCCATCCGCATCCTGGGCTGG^

TGGCACACCCTACTGGCTGGTTGCCAACTCCTGGAACACTGACTGGGGTGACAATGGCTTCTTTAAA 

AGTACTGGGAAAAGATCCTC GAGGGCAAGGGTGGGCGCGCC 



ORF Start: at 2 | ORF Stop: end of sequence





SEQ ID NO: 206 |125 aa |MW at 13666.1kD


NOV49f, 
247856434 
Protein Sequence 


GSAAAPFTGSSNSEKDIMAEIYKNGPVEGAFSVYSDFLLYKSGVYQHVTGEmGGHAIRILGWG\^ 
GTPYWLVANSWNTDWGDNGFFK ILRGQDHCG IESEVVAG I PRTDQYWEKILEGKGGRA 
SEQ ID NO: 207 |574 bp


NOV49g, 
247856497 DNA 
Sequence 


aGGCTCCGCGGCCGCCCCCTTCACCGGATCCATGTGGCAGCTCTGGGCCTCCCTCTGCTGCCTGCTG 
GTGTTGGCCAATGCCCGGAGCAGGCCCTCTTTC 

ACAAACGGAATACCACGTGGCAGGCCGGGCACAACTTCTACAACGTGGACATGAGCTACTTCAAGAG 
GCTATGTGGTACCTTCCTGGGTGGGCCCAAGCCACCCCAGAGAGTTATGTTTACCGAGGACCTGAAG 
CTGCCTGCAAGCTTCGATG(^CGGGAACAATGGCCACAGTGTCCCACCATCAAAGAGATCAGAGACC 

CACCAATGCGCACGTCAGCGTGGAGGTGTCGGCGGAGGACCTGCTCACATGCTGTGGCAGCATGTGT 
GGGGACGGCTGTAATGGTGGCTATCCTGCTGAAGCTTGGJUVCTTCTGGACAAGAAAAGGCCTGGTTT 
CTGGTGGCCTCTATCTCGAGGGCAAGGGTGGGCGCGCC 




ORF Start: at 2 | ORF Stop: end of sequence






SEQ ID NO: 208 |191 aa |MW at 20877.5kD


NOV49g, 
247856497 
Protein Sequence 


GSAAAPFTGSMWQLWASLCCLLVIJVNARSRPSFHPLSD^^ 

LCGTFI^GPKPPQRVMFTBDLKLPASFDAREQWPQCPTIKEIRDQGSCGSCWAFGAVEAISDRICIH 
TNAHV S VEVS AEDL.LTCCG SMCGDGCNGGY P ASA WNFWTRKGL VS GGL YLEGKGGRA 





SEQ ID NO: 209 |590 bp


NOV49h,
247856493 DNA 
Sequence 


AGGCTCCGCGGCCGCCCCCTTCACCGGATCCATGTGGCAGCTCTGGGCCTCCCTCTGCTGC 

GTGTTGGCCAATGCCCGGAGCAGGCCCTCTTTCCATCCCGTGTCGGATGAGCTGGTCAACTATGTCA 

ACAAACXSGAATACCACGTGGCAGGCCGGGCACAACTTCTACAACGTGGACATGGGCTACT^ 

GCTATGTGGTACCTTCCTGGGTGGGCCCAAGCCACCCCAGAGAGTTATGTTTACCGAGGACCTGAAG 

CTGCCTGCAAGCTTCGATGCACGGGAACAATGGCCACAGTGTCCCACCATCAAAGAGATCAGAGACC 

AGGGCTCCTGTGGCTCCTGCTCGGCCTTCGGGGC^ 

CACCAATGCGCACGTCAGCGTGGAGGTGTCGGCGGAGGACCTGCTCACCTGCTGTGGCAGCATGTGT 
GGGGACGGCTGTAATGGTGGCTATCCTGCTGAAGCTTGGAACTTCTGGACAAGAAAAGGCCTGGTTT 




ORF Start: at 2 | ORF Stop: end of sequence






SEQ ID NO: 210 | 197 aa | MW at 21367.0kD


NOV49h, 
247856493 
Protein Sequence 


GSAAAPFTGSMWQLWAJSLCCLLVLANARSRPSFHPVSDELVNYVNKRNTTWQ 

LCGTFLGG PK P PQRVMFTEDLKL PASFDARBQWPQC PT IKEIRDQGSCGSCWAFGAVEA I SDR IC IH 

TNAHVSVEVSAEDLLTCCGSMCGIX3CNGGYPA 





SEQ ID NO: 211 | 551 bp


NOV49i, 


aggctccgcggccgcccccttcaccggatcccggagcaggccctctttccatcccctgtcggatgag' 


247856574 DNA 
Sequence 


CTGGTCAACTATGTCAACAAACGGAATACCACGTGGCAGGCCGG'GCAE^ 

TGAGCTACTTGAAGAGGCTATGTGGTACCTTCCTGGGTGGGCCCAAGCCACCCCAGAGAGTTATGTT 
TACCGAGGACCTGAAGCTGCCTGCAAGCTTCGATGCACGGGAACAATGGCCACAGTGTCCCACCATC 
AAAGAGATCAGAGACCAGGGCTCCTGTGGCTCCTGCTGGGCCTTCGGGGCTGTGGAAGCCATCTCTG 
ACCGGATCTGCATCCACACCAATGCGCACGTCAGCGTGGAGGTGTCGGCGGAGGACCTGCTCACATG 
C TGTGGC AGCATGTGTGGGGACGGCTCTAATGGTGGC TATCCTG CTG AAGCTTGGAACTTCTGGACA 
AGAAAAGGCCTGGTTTC TGGTGGCCTCTATCTCGAGGGCAAGGGTGGGCGCC CCG ACCC AGC TTTCC 
CGTACAAAGCTGGCA 




ORF Start: at 2 | ORF Stop: end of sequence





SEQ ID NO: 212 | 184 aa | MW at 19933.2kD


NOV49i, 
247856574 
Protein 
Sequence 


GSAAAPFTGSRSRPSFHPLSDELVNYVNKRNTT^ 

TEDLKLPASFDAREQWPQCPTIKSIRDQGSCGSCWAPGAVEAISDRICIHTNAHVSVEVSAEDLLTC 
CGSMCGDGCNGGYPAEAWNFWTRKGLVSGGLYLEGKGGRPDPAFPYKAGX 





SEQ ID NO: 213 | 523 bp




NOV49j, 
247856545 DNA 
Sequence 


AGGCTCCGCGGCCGCCCCCTTCACCGGATCCCGGAGCAGGCCCTCTTTCCATCCCCTGTCGGATGAG 
CTGGTC AACTATGTCAACAAACGGAATACCACGTGGC AGGC CGGGC ACAACTTC T AC AACGTGGACA 
TGAGCTACTTGAAGAGGCTATGTGGTACCTTCCTGGGTGGGCCCAAGCCACCCCTGAGAGTTATGTT 
TACCGAGGACCTGAAGCTGCCTGCAAGCTTCGATGCACGGGAACAATGGCCACAGTGTCCCACCATC 
AAAGAGATCAGAGACCAGGGCTCCTGT^CTCCTGCTGGGCCTTCGGGGCTGTGGAAGCCATCTCTG 
ACCGGATCTGCATCCACACCAATGCGCACGTCAGCGTGGAGGTGTCGGCGGAGGACCTGCTCACATG 
CTGTGGCGGCATGTGTGGGGACGGCTGTAATGGTGGCTATCCTGCTGAAGCTTGGAACTTCTGGACA 
AGAAAAGGCCTGGTTTCTGGTGGCCTCTATCTCGAGGGCAAGGGTGGGCGCGCC 




ORF Start: at 2 | ORF Stop: end of sequence





SEQ ID NO: 214 | 174 aa | MW at 18915.1kD


NOV49j, 
247856545 
Protein 
Sequence 


GSAAAPFTGSRSRPSraPLSDELVNYVNKROTTWQ^ 

TEDLKLPASFPAREQV^QCPTIKEIRDQGSCGSCWAFGAVEAISDRICIHTNAHVSVEVSA 
CGGMCGDGCNGGYPAEAWNFWTRKGLVSGGLYLEGKGGRA 





SEQ ID NO: 215 | 1036 bp


NOV49k, 
275480714 DNA 
Sequence 


CACCCTCGAGATGTGGCAGCTCTGGGCCTCCCTCTGCTGCCTGCTGGTGTTGGCCAATGCCCGGAGC 
AGGCCCTCTTTCCATCCCCTGTCGGATGAGCTGGTCAACTATGTCAACAAACGGAATACCACGTGGC 
AGG CCGGGC ACAAC TTCTAC AACG TGG AC ATGAGC TACT TQAAG AGGCT ATGTG GTAC C TTC CTGGG 
TGGGCCCAAGCCACCCCAGAGAGTTATGTTTACCGAGGACCTGAAGCTGCCTGCAAGCTTCGATGCA 
CGGG AACAATGGCCACAGTGTCCCACCATCAAAGAGATCAGAGACCAGGGCTCCTGTGGCTCCTGCT 
GGGCCTTCGGGGCTGTGGAAGCCATCTCTGACCGGATCTGCATCCACACCAATGCGCACGTCAGCGT 
GG AG GTGTCGGC GG AGG AC C TG C TC AC ATGC TGTGGC AGCATG TG TGGGGACGGC TGT AATGGTGGC 
TATCCTGCTGAAGCTTCGAACTTCTGGACAAGAAAAGGCCTGGTTTCTGGTGGCCTCTATGAATCCC 
ATGTAGGGTGCAGACCGTACTCCATCCCTCCCTGTGAGCACCACGTCAACGGCTCCCGGCCCCCATG 
CACGGGGGAGGGAGATACCCCCAAGTGTAGCAAGA
CAGGACAAGCACTACGGATACAATTCCTACAGCGTCTCCAATAGCGAGAAGGACATCATGGCCGAGA 
TCTACAAAAACGGCCCCGTGGAGGGAGCTTTCTCTGTGTATTCGGACTTCCTGCTCTACAAGTCAGG 
AGTGTACCAACACGTCACCGGAGAGATGATGGGTGGCCATGCCATCCGCATCCTGGGCTGGGGAGTG 
GAGAATGGCACACCCTACTGGCTGGTTGCC^CTCCTGGAACACTGACTGGGGTGACAATGGCTTCT 
TTAAAATACTCAGAGGACAGGATCACTGTGGAATCGAATCAGAAGTGGTGGCTGGAATTCCACGCAC 
CGATCAGTACTGGGAAA AGATCGTCGACGGC 

ORF Start: at 2 | ORF Stop: end of sequence





SEQ ID NO: 216 | 345 aa | MW at 38435.9kD


NOV49k,
275480714 
Protein Sequence 


TLEMWQLWASLCCLLVLANARSRPSFHPLSDEXVNYVNKRNTTWQ 

GPKPPQRVMFTEDLKLPASFDAREQWPQCPTIKEIRDQGSCGSCWAPGAVEAISDRICIHTNAHVSV 
EVSAEDLLTCCGSMCGI)GCNGGYPAEAWNFWTRKGLVSGGLYESHVGCRPYSIPPCEHHVNGSRPPC 
TGEGDTPKC SK ICBPGYS PTYKQDKHYGYNS YSVSNS EKDIMAE I YKNGFVEGAFSVYSDFLIjYKSG 
WQHVTGEMMGGHAIRILGWGVENGTPYWLVANSWNTDWGDNGPFKILRGQDHCGIESEWAGIPRT 
DQYWEKIVDG 



Sequence comparison of the above protein sequences yields the sequence
relationships shown in Table 49B.



Table 49B. Comparison of NOV49a against NOV49b through NOV49k.

Protein Sequence | NOV49a Residues/Match Residues | Identities/Similarities for the Matched Region
NOV49b | 1..141; 1..141 | 141/141 (100%); 141/141 (100%)
NOV49c | 1..176; 1..176 | 175/176 (99%); 176/176 (99%)
NOV49d | 1..339; 1..281 | 279/339 (82%); 280/339 (82%)
NOV49e | 80..180; 11..111 | 96/101 (95%); 96/101 (95%)
NOV49f | 233..339; 11..117 | 107/107 (100%); 107/107 (100%)
NOV49g | 1..180; 11..190 | 175/180 (97%); 175/180 (97%)
NOV49h | 1..180; 11..190 | 173/180 (96%); 174/180 (96%)
NOV49i | 17..181; 10..174 | 159/165 (96%); 160/165 (96%)
NOV49j | 17..180; 10..173 | 144/164 (87%); 145/164 (87%)
NOV49k | 1..339; 4..342 | 339/339 (100%)
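The residue ranges and identity counts in a comparison table of this kind can be cross-checked mechanically. The Python sketch below is illustrative only: the helper names are hypothetical, and it assumes that ranges are inclusive and that the printed identity percentage is truncated rather than rounded. That assumption is consistent with rows such as 144/164 (87%) but is not stated in the text.

```python
import re


def span_length(span: str) -> int:
    """Length of an inclusive residue range written as 'start..end'."""
    start, end = (int(part) for part in span.split(".."))
    return end - start + 1


def parse_identity(cell: str) -> tuple:
    """Split a cell such as '175/180 (97%)' into (matches, length, percent)."""
    match = re.fullmatch(r"\s*(\d+)/(\d+)\s*\((\d+)%\)\s*", cell)
    if match is None:
        raise ValueError(f"unrecognised identity cell: {cell!r}")
    return tuple(int(group) for group in match.groups())


def truncated_percent(matches: int, length: int) -> int:
    """Integer percentage with the fractional part dropped (assumed convention)."""
    return matches * 100 // length


if __name__ == "__main__":
    # NOV49e row: NOV49a residues 80..180, identities 96/101 (95%).
    matches, length, printed = parse_identity("96/101 (95%)")
    print(span_length("80..180") == length)
    print(truncated_percent(matches, length) == printed)
```

A row whose range length and alignment length disagree, or whose printed percentage differs from the recomputed one, is a candidate transcription error rather than proof of one, since gapped alignments can legitimately differ in length from either range.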



Further analysis of the NOV49a protein yielded the following properties shown in 
Table 49C. 



Table 49C. Protein Sequence Properties NOV49a


PSort analysis: 


0.3700 probability located in outside; 0.1900 probability located in lysosome 
(lumen); 0.1376 probability located in microbody (peroxisome); 0.1000 
probability located in endoplasmic reticulum (membrane) 


SignalP analysis:


Cleavage site between residues 18 and 19 
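A PSort summary of the form shown in Table 49C can be turned into a lookup of location probabilities. The following Python sketch is a minimal illustration, not part of the original analysis; the function names are hypothetical, and the input string is copied from the table above.

```python
def parse_psort(summary: str) -> dict:
    """Map each predicted location to its probability.

    Expects entries of the form '<p> probability located in <place>'
    separated by semicolons, as printed in the PSort rows above.
    """
    scores = {}
    for entry in summary.split(";"):
        prob, sep, location = entry.strip().partition(" probability located in ")
        if not sep:
            raise ValueError(f"unrecognised PSort entry: {entry!r}")
        # Collapse internal whitespace left over from line wrapping.
        scores[" ".join(location.split())] = float(prob)
    return scores


def most_likely_location(summary: str) -> str:
    """Return the location with the highest reported probability."""
    scores = parse_psort(summary)
    return max(scores, key=scores.get)


PSORT_NOV49A = (
    "0.3700 probability located in outside; "
    "0.1900 probability located in lysosome (lumen); "
    "0.1376 probability located in microbody (peroxisome); "
    "0.1000 probability located in endoplasmic reticulum (membrane)"
)

if __name__ == "__main__":
    print(most_likely_location(PSORT_NOV49A))
```

The highest score is only a ranking among PSort's own candidates; it should be read together with the SignalP prediction in the same table.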



A search of the NOV49a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 49D.
Table 49D. Geneseq Results for NOV49a


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV49a
Residues/
Match
Residues


Identities/
Similarities for
the Matched
Region


Expect 
Value 


AAR90616 


Anti-procathepsin B 
monoclonal antibody - Homo 
sapiens, 339 aa. 
[JP07309900-A, 

OH MOV IQO^l 


1..339 
1..339 


338/339(99%) 
339/339 (99%) 


0.0 


AAB53470 


Human colon cancer antigen 
protein sequence SEQ ID 
NO: 1010 - Homo sapiens, 
344 aa. [WO200055351-A1/ 

Zl-olir-ZwUJ 


1..339
6..344 


338/339 (99%) 
338/339(99%) 


0.0 


ABP41147 


Human ovarian antigen 
HOFMP73, SEQ ID 
NO:2279 - Homo sapiens, 
346 aa. [WO200200677-A1, 
03-JAN-2002] 


1..339
8..346 


290/339(85%) 
317/339(92%) 


0.0 


ABB06116 


Human NS protein sequence 
SEQ ID NO:208 - Homo
sapiens, 273 aa. 
[WO200206315-A2, 
24-JAN-2002] 


1..267 
1..267 


266/267(99%) 
266/267 (99%) 


e-167 


ABB65378 


Drosophila melanogaster 
polypeptide SEQ ID NO 
22926 - Drosophila 
melanogaster, 340 aa. 
[WO200171042-A2,
27-SEP-2001] 


13..331 
13..339 


190/330(57%) 
232/330 (69%) 


e-113 



In a BLAST search of public sequence databases, the NOV49a protein was found to
have homology to the proteins shown in the BLASTP data in Table 49E.



Table 49E. Public BLASTP Results for NOV49a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV49a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 






P07858 


Cathepsin B precursor (EC 
3.4.22.1) (Cathepsin B1) (APP
secretase) - Homo sapiens 
(Human), 339 aa. 



1..339 
1..339 


338/339 (99%) 
339/339 (99%) 


0.0 


KHBOB 


cathepsin B (EC 3.4.22.1) 
precursor - bovine, 335 aa. 


1..335 
1..335 


280/335 (83%) 
307/335 (91%) 


e-180 


P07688 


Cathepsin B precursor (EC 
3.4.22.1) - Bos taurus
(Bovine), 335 aa. 


1..335 
1..335 


279/335 (83%) 
307/335 (91%) 


e-180 


P00787


Cathepsin B precursor (EC 
3.4.22.1) (Cathepsin B1)
(RSG-2) - Rattus norvegicus 
(Rat), 339 aa. 


1..336 
1..336 


265/336 (78%) 
299/336 (88%) 


e-175 


P10605 


Cathepsin B precursor (EC 
3.4.22.1) (Cathepsin B1) -
Mus musculus (Mouse), 339 
aa. 


1..336 
1..336 


267/336(79%) 
297/336(87%) 


e-174 






Pfam analysis predicts that the NOV49a protein contains the domains shown in
Table 49F.



Table 49F. Domain Analysis of NOV49a

Pfam Domain | NOV49a Match Region | Identities/Similarities for the Matched Region | Expect Value
Peptidase_C1 | 80..329 | 112/344 (33%); 218/344 (63%) | 1.3e-117



Example 50. 

The NOV50 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 50A.



Table 50A. NOV50 Sequence Analysis 




SEQ ID NO: 217 | 960 bp


NOV50a, 
CG57284-01 
DNA Sequence 


CCCGTCCGAGCCCCGGCCCCAAGTAACRCCGCCGCCCCGGAGCCGCCTTGGAGGTCCCCCTCCCCAC 


TAAGTGCCTCTTTGCATAGCACCAGTCCCCACCCGCACGCTCTCTGGACCACTACAGCTGGACGGGC 


AATGGCGGGTCGGGGAGGCGCACGACGACCCAATGGACCAGCTGCTGGGAACAAGATCTGTCAAT^ 
AAGCTGGTTCTGCTGGGGGAGTCTGCGGTAGGCAAATCCAGCCTCGTCCTCCGCTTTGTCAAGGGAC 
AGTTTCACGAGTACCAGGAGAGCACAATTGGAGCGGCCTTCCTCACACAGACTGTCTGCCTGGATGA 
CACAACAGTCAAGTTTGAGATCTGGGACACAGCTGGACAGGAGCGGTATCACAGCCTGGCCCCCATG 
T ACTATCGGGGGGCCCAGGCTGCCATC G TGGTC TATGACATC AC C AACAC AGATACATTTGCAC GGG 
C CAAGAACTGGGTG AAGGAGC TAGAG AGGC AGGCCAGC CCCA^ 
CAAGGC AGACCTGGCCAGC AAGAGAGCCG TGG AA
AGTTTGCTGTTCATGGAGACATCAGCAAAGACTGCAATGAACGTGAACGSAAATCTTCATGGCAATAG 
C T AAG AAGCTTCC CAAGAAC GAGCC CC AG AATGC AAC TGGTGC TCCAGGC CGAAAC CG AGG TGTGGA 
rPTrrAGGAGAACAACCCAGCCAGCCGGAGCCAGTGCTGCAGCAACTG AGCCCCCCTTGCCTGCCCG 



CTGCCCCCGCCTCCTCCGCCTGAATGACCCGACTGGAATCCACTCTAACCAATCGCACTTAACGACT 



CGGGCCACCACTGGGGGGGCAGGQQGAGGGGTCCACCATGATTTCTCCATATAATTTTGATCATAGQ 



CCGQAGTGAGTCATTCCACCyG. 



ORF Start: ATG at 136 | ORF Stop: TGA at 784





SEQ ID NO: 218 | 216 aa | MW at 23567.4kD


NOV50a, 
CG57284-01 
Protein Sequence 


MAGRGGAIUIPNGPAAGNKICQFKLVIjIXjESAVG 

TTVKF E I WDTAG QER YH S L APMYYRGAQ AAI WYD I TN TDT F ARAKNWVKELQRQAS PNI V I ALAGN 
KADL ASKRAVEF QEAQ A YADDNS L L FMETS AKT AMNVNE I FMA I AKKL PKNE PQNATGA PGRNRGVD 
LQENNPASRSQCC SN 





SEQ ID NO: 219 |747bp 




NOV50b,
CG57284-03 
DNA Sequence 


CCACTAAGTGCCTCTTTGCATAGCACCAGTCCCCACCCGCACGCTCTCTGGACCACTACAGCTGGAC 


GGGCAATGGCGGGTCGGGGAGGCGCAGCACGACCCAATGGACCAGCTGCTGGGAACAAGATCTGTCA 
ATTTAAGCTGGTTCTGCTGGGGGAGTCTGCGGTAGGCAAATCCAGCCTCGTCCTCCGCTTTGTCAAG 
GGAC AGTTTCACGAGTACC AGGAGAGCAC AAT TGG AGCGGCCTTCCTC AC AC AGACTGTC TGCCTGG 
ATG AC AC AACAGTCAAGTTTGAGATCTGGGACACAGCTGGAC AGGAGCGGTATC ACAGC C TGGCCCC 
CATGTACTATCGGG<3GGCCCAGGCTGCCATCGTGGTCTATGACATCACCAACATCGTCATTGCGCTC 
GCGGGTAACAAGGC AGACCTGGCCAGC AAGAGAGC CGTGGAAT TCC AGGAAGC AC AAGCC TATGC AG 
ACGACAACAGTTTGCTGTTCATGGAGLACATCAGCAAAGACTGCAATGAACGTGAACGAAATCTTCAT 
GGCAATAGCTAAGAAGCTTCCCAAGAACGAGCCCCAGAATGCAACTGGTGCTCCAGGCCGAAACCGA 

GGTGTGGACTTCCAGGAGAACAACCCAGCCAGCCGGAGCCAGTGCTGCAGCAACTGAG£CCCJCCTJS 
CCTGCCCGCTGCCCCCGCCTCCTCCGCCTGAATGACCCGACTGGAATCCACTCTAACCAATCGCACT 






TAACGACTCG 




ORF Start: ATG at 73 | ORF Stop: TGA at 658





SEQ ID NO: 220 | 195 aa | MW at 21039.6kD


NOV50b,
CG57284-03 
Protein Sequence 


MAGRGGAARPNGPAAGNKICQFKLVLI^ESAVGKSSLVLRFVKGQFHEYQESTIGAAFLTQTVCLDD 

TTVKFEIWDTAGQERYHSLAPMYYRGAQAAIVVTO^ 

NSLLFMETSAKTAHNVNEIFRAIAKKLPKNEPQ^^ 





SEQ ID NO: 221 |819 bp 




NOV50c, 
CG57284-02 
DNA Sequence 


AATCGCCTTCCACTAAGTGCX:TCTTTGCATAGC!ACCAGTCCCCACCCGCACGCTCTCTGGACCACTA 


CAGCTGGACGGGCAATGGCGGGTCGGGGAGGCGCAGCACGACCCAATGGACCAGCTGCTGGGAACAA 
GATCTGTCAATTTAAGCTGGTTCTGCTGGGGGAGTCTGCGGTAGG 

TTTGTCAAGGGACAGTTTCACGAGTACCAGGAGAGCACAATTGGAGCGGCCTTCCTCACACAGACTG 

TCTGCCTGGATGACACAACAGTCAAGTTTGAGATCTGGGACACAGCTGGACAGGAGCGGTAT 

CCTGGCCCCCATGTACTATCGGGGGGCCCAGGCTGCCATCGTGGTCTATGACATCACCAACACAGAT 

ACATTTGCACGGGCCAAGAACTGGGTGAAGGAGCTACAGAGGCAGGCCAGCCCCAACATCGTCATTG 

CACTCGCGGGTAACAAGGCAGACCTGGCCAGCAAGAGAGCCGTGGAATTCC AGGAAGC AC AAGCC TA 

TGCAGACGACAACAGTTTGCTGTTCATGGAGACATCAGCAAAGACTGCAATGAACG 
TTCATGGCAATAGCTAAGAAGCTTCCCAAGAACGAG<L

ACCGAGGTGTGGACCTCCAGGAGAACAACCCAGCCAGCCGGAGCCAGTGCTGCAGCAACTGAGCCCC 
CCTTGCCTGCCCGCTGCCCCCGCCTCCTCCGCCTGAATGACCCGACTGGAATCCACTCTAACCAATr 




GCACTTAACGACTCG 




ORF Start: ATG at 82 | ORF Stop: TGA at 730






SEQ ID NO: 222 | 216 aa | MW at 23482.3kD


NOV50c, 
CG57284-02 
Protein Sequence 


MAGRGGAAR PNG PAAGNK I CQFKL VLLGESAVGK S SLVLRFVKGQFHEYQEST IG AAFLTQTVCLDD 
TTVKFEIVn^AGQERYHSLAPMYYRGAQAAI^ 

KADLASKRAVEFQEAQA YADDNSLLFMETS AKTAMNVNE I FMAI AKKLPKNEPQNATGAPGRNRG VD 
LQENNPASRSQCCSN 
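The ORF coordinates and protein lengths reported in Table 50A can be checked against each other. The Python sketch below is a hedged illustration: it assumes that positions are 1-based and that the stop position marks the first base of the stop codon, so that the coding region runs from the start position up to, but not including, the stop position. Neither convention is spelled out in the text, and the function and dictionary names are hypothetical.

```python
def orf_codons(start: int, stop: int) -> int:
    """Number of sense codons between the start codon and the stop codon.

    start: 1-based position of the first base of the ATG.
    stop:  1-based position of the first base of the stop codon (assumed).
    """
    coding_bases = stop - start
    if coding_bases <= 0 or coding_bases % 3 != 0:
        raise ValueError("start and stop are not in the same reading frame")
    return coding_bases // 3


# (ORF start, ORF stop, reported protein length in aa), taken from Table 50A.
TABLE_50A = {
    "NOV50a": (136, 784, 216),
    "NOV50b": (73, 658, 195),
    "NOV50c": (82, 730, 216),
}

if __name__ == "__main__":
    for name, (start, stop, reported) in TABLE_50A.items():
        print(name, orf_codons(start, stop), reported)
```

Under these assumptions the three entries are internally consistent, which also lends some support to reading the stop coordinate as the first base of TGA.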



Sequence comparison of the above protein sequences yields the sequence
relationships shown in Table 50B.
Table 50B. Comparison of NOV50a against NOV50b and NOV50c.

Protein Sequence | NOV50a Residues/Match Residues | Identities/Similarities for the Matched Region
NOV50b | 18..216; 18..195 | 178/199 (89%); 178/199 (89%)
NOV50c | 18..216; 18..216 | 199/199 (100%); 199/199 (100%)



Further analysis of the NOV50a protein yielded the following properties shown in
Table 50C.



Table 50C. Protein Sequence Properties NOV50a


PSort analysis: 


0.6500 probability located in cytoplasm; 0.2189 probability located in 
lysosome (lumen); 0.1000 probability located in mitochondrial matrix space; 
0.0000 probability located in endoplasmic reticulum (membrane) 


SignalP analysis: 


No Known Signal Sequence Predicted 



A search of the NOV50a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 50D.
Table 50D. Geneseq Results for NOV50a


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #,Date] 


NOV50a
Residues/
Match
Residues


Identities/
Similarities for
the Matched
Region


Expect 
Value 


AAM7Q9?^ 


Human protein SEQ ID NO
1887 - Homo sapiens, 215 aa. 
[WO200157190-A2, 
09-AUG-2001] 


8..215 


170/7fifc (R(\C? n \ 
i /yi/Ajo \o\jvo) 

194/208 (93%) 


e-iui 




ri u man wm-i amino acio 
sequence - Homo sapiens, 
215 aa. [CA2200794-A, 
24-SEP-1998] 


8..215 


1 fy/ZVo \OV70) 

194/208(93%) 




AAB28187 


Human RAS-related protein
RAB-5A - Homo sapiens, 

08-SEP-2000] 


1..197 
1..192


178/197 (90%) 
186/197 (94%) 


9e-97 


AAM80209 


Human protein SEQ ID NO 
3855 - Homo sapiens, 255 aa. 
[WO200157190-A2, 
09-AUG-2001] 


9..216 
47..255 


172/209 (82%) 
189/209 (90%) 


le-95 


ABB60036 


Drosophila melanogaster 
polypeptide SEQ ID NO 
6900 - Drosophila
melanogaster, 219 aa. 
[WO200171042-A2, 
27-SEP-2001] 


2..214 
11..218


159/213(74%) 
177/213 (82%) 


8e-85 



In a BLAST search of public sequence databases, the NOV50a protein was found to
have homology to the proteins shown in the BLASTP data in Table 50E.



Table 50E. Public BLASTP Results for NOV50a


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV50a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Portion 


Expect 
Value 


P51148 


Ras-related protein Rab-5C 
(RAB5L) (L1880) - Homo
sapiens (Human), 216 aa. 


1..216 
1..216 


216/216(100%) 
216/216(100%) 


e-122 
AAM21086


Small GTP binding protein
RAB5C - Homo sapiens
(Human), 216 aa.


1..216
1..216


215/216 (99%)
215/216 (99%)


e-121



Q8R1V8 



Hypothetical 23.4 kDa 
protein - Mus musculus 
(Mouse), 216 aa. 



1..216 
1..216 



212/216 (98%) 
213/216(98%) 



e-119 



P51147 



Ras-related protein Rab-5C - 
Canis familiaris (Dog), 216 
aa. 



1..216 
1..216 



212/216 (98%) 
213/216 (98%) 



e-119 



Q98932 



Rab5C-like protein - Gallus 
gallus (Chicken), 216 aa. 



1..216 
1..216 



203/216 (93%) 
208/216 (95%) 



e-114 



Pfam analysis predicts that the NOV50a protein contains the domains shown in
Table 50F.
Table 50F. Domain Analysis of NOV50a

Pfam Domain | NOV50a Match Region | Identities/Similarities for the Matched Region | Expect Value
arf | 4..185 | 40/198 (20%); 105/198 (53%) | 0.0018
ras | 23..216 | 90/209 (43%); 181/209 (87%) | 3.1e-104



Example 51. 

The NOV51 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 51A.



Table 51A. NOV51 Sequence Analysis 




SEQ ID NO: 223 | 4826 bp


NOV51a, 
CG57308-01 
DNA Sequence 


AGCTGAGCCCGAGCCCAGACCGCX3CCCGCGCCGCCATGCCCCTGGCCTTCTGCGGCAGCGAGAACCA 


ctcggccgcctaccgggtggaccagggggtcctcaacaacggctc 
gix^cgcacgtcttcctactcttovtcaccttcccc^ 

ccaaggtgcacatccaccacagcacatggcttcatttccctgggcacaacctgcggtggatcctgac 
cttcatgctgc tcttcgtcc tggtgtgtgagattgcagagggcatcctgtctgatggggtgaccgaa 
tcccaccatctgcacctgtacatgccagccggx^tg<^gttcatggctgctgtcacctccgtggtct 
actatcacaacatcgagacttccaacttccccaagctgctaattgccctgctggtgtat 
ggccttcatcaccaagaccatcaagtttgtcaagttcttggaccacgccatcggcttctcgcagcta 




CGCTTCTGCCTCACAGGGCTGCTGGTGATCCTCTATGGGATGCTGCTCCTCGTGGAGGTCAATGTCA 
TCAGGGTG AGG AGAT AC ATC TTCTTC AAGAC ACD3 AGGG AGGTGAAGCC TCCCGAGGAC C TGCAAG A 
CCTGGG<3GTACGCTTCCTGCAGCCCTTCGTGAATCTGC 

GCCTTCATCAAGACTGCCCACAAGAAGCCCATCGACTTGCGAGCCATCGGGAAGCTGCCCATCGCCA 
TGAGGGCCCTCACCAACTACCAACGGCTCTGCGAGGCCTTTGACGCCCAGGTGCGGAAGGACATTCA 
GGGC AC TC AAGGTG C CC GGG C CATCTGGC AGGC ACTC^AGti cUlb C C'^ t^G^Ji^CCC^ t!!?G'TC 6 TC^f 1
AGCAGCACTTTCCGCATCTTGGCCGACCTGCTGGGCTTCGCCGGGCCACTGTGCATCTTTGGGATCG 
TGGACC ACC TTGGGAAGG AG AACG ACGTCTTCC AGCCC AAGAC ACAATTTC TCGGGGTTTACTTTGT 
CTCATCCCAAGAGTTCCTTGCCAATGCCTACGTCTTAGCTGTGCTTCTGTTCCTTGCCCTCCTACTG 
CAAAGGAC ATT TCTGCAAG CAT CC TACT ATGTG GCCATTG AAAC TGG AATTAAC T TG AG AGG AGCAA 
TACAGACCAAGATTTACAATAAAATTATGCACCTGTCCACCTCCAACCTGTCCATGGGAGAAATGAC 
TGCTGGACAGATCTGTAATCTGGTTGCCATCGACACCAATCAGCTCATGTGGTTOTT^ 
CC AAACCTC TGGGCTATGCCAGTACAGATC ATTGTGGGTGTGATTC TCCTCTAC TAC ATACTCGGAG 
TCAGTGCCTTAATTGGAGCAGCTGTCATCATTCTACTG<3CTCC 

GCTGICTCAGGCCCAGCGGAGCACACTGGAGTATTCCAATGAGCGGCTGAAGCAGACCAACGAGATG 
CTCCGCGGCATCAAGCTGCTGAAGCTGTACGCCTGGGAGAACATCTTCCGCACGCGGGTGGAGACGA 
CCCGC AGGAAGGAGATGACCAGCC TC AGGGCCTTTGCC ATCTATAC CTCCATC TCC ATTTTC ATGAA 
C ACGGCCATCCCCATTGCAGCTGTCC TCATAACTTTCGTGGGCCATGTC AGC TTCTTCAAAGAGGCC 



TG CTGTC CAGTG TGGTCC GATC T ACCGTC AAAGCTCT AGTG AGCGTGC AAAAGC TAAGCG AG T TC CT 
GTCCAGTGCAGAGATCCGTGAGGAGCAGTGTGCCCCCCATGAGCCCACACCTCAGGGCCCAGCCAGC 
AAGTACCAGGCGGTGCCCCTCAGGGTTGTGAACCGCAAGCGTCCAGCCCGGGAGGATTGTCGGGGCC 
TC ACCGGCCCAC TGCAGAGCCTGGTCCCC AG TGCAGATGGCG ATGCTG AC AAC TGC TGTGTCC AGAT 
CATGGGAGGCTACTTCACGTGGACCCCAGATGGAATCCCCACACTGTCCAACATCACCATTCGTATC 
CCCCGAGGCCAGCTGACTATGATCGTGGGGCAGGTGGGCTGCGGCAAGTCCTCGCTCCTTCTAGCCG 
CACTGGGGGAGATGCAGAAGGTCTCAGGGGCTGTCTTCTGGAGCAGCCTTCCTGACAGCGAGATAGG 
AGAGGACCCCAGCCCAGAGCGGGAGACAGCGACCGACTTGGATATCAGGAAGAGAGGCCCCGTGGCC 
TATGCTTCGCAGAAACCATGGCTGCTAAATGCCACTGTGGAGGAGAACATCATCTTTGAGAGTCCCT 
TCAACAAACAACGGTACAAGATGGTCATTGAAGCCTGCTCTCTGCAGCCAGACATCGACATCCTGCC 
CCATGGAGACCAGACCCAGATTGGGGAACGGGGCATCAACCTGTCTGGTGGTCAACGCCAGCGAATC 



ATATCCATCTGAGTGACCACTTAATGCAGGCCGGCATCCTTGAGCTGCTCCGGGACGACAAGAGGAC 
AGTGGTCTTAGTGACCCACAAGCTACAGTACCTGCCCCATGCAGACTGGATCATTGCCATGAAGGAT 
GGCACCATCCAGAGGGAGGGTACCCTCAAGGACTTCCAGAGGTCTGAATGCCAGCTCTTTGAGCACT 
GG AAGACCCTC ATGAACCGAC AGGAC CAAGAGCTGG AGAAGGAGAC TGTC AC AGAGAGAAAAGCC AC 
AGAGCCACCCCAGGGCCTATCTCGTGCCATGTCCTCGAGGGATGGCCTTCTGCAGGATGAGGAAGAG 
GAGGAAGAGGAGK3CAGCTGAGAGCGAGGAGGATGACAACCTGTCGTCCATGCTGCACCAGCGTGCTG 
AGATCCCATGGCGAGCCTGCGCCAAGTACCTGTCCTCCGCCGGCATCCTGCTCCTGTCGTTGCTGGT 
CTTCTCACAGCTGCTIXIAAGCACATGGTCCTGGTGGCCATC 

AGCGCCCTGACCCTGACCCCTGCAGCCAGGAACTGCTCCCTCAGCCAGGAGTGCACCCTCGACCAGA 
CTGTCTATGCCATGGTGTTCACGGTGCTCTGCAGCCTGGGCATTGTGCTGTGCCTCGTCACGTCTGT 
C ACTGTGGAGTGGACAGGGCTGAAGGTGGCCAAGAGACTGC AC CGCAGCC TGCT AAACCGGATCATC 
CTAGCCCCCATGAGGTTTTTTGAGACCACGCCCCTTGGGAGCATCCTGAACAGATTTTCATCTGACT 
GTAACACCATCGACCAGCACATCCCATCCACGCTGGAGTGCCTGAGCCGCTCCACCCTGCTCTGTGT 
CTGAGCCCTGGCCGTCATCTCCTATGTCACACCTGTC 

GTGTGCTACTTCATCCAGAAGTACTTCCGGGTX^CGTCCAGGGACCTGCAGCAGCTGGATGACACCA 
CCCAGC TTCC AC TTCTCTC AC AC TTTGCCGAAACCGTAGAAGGACTCAC C ACC ATCCGGGCCTTC AG 
GTATGAGGCCCGGTTOTAGCAGAAGCTTCTCXSAATACACAGACTCCAACAACATTGCTTCCCTCTTC 



cagcggtgacctccatctccaactccctgcacagggagctctctgctggcctggtgg^ctgggcct 
tacc tacgccc taatggtctccaactacctcaactggatggtgaggaac ctggcagacatggagctc 
cagc tgggggc tgtgaagcgc atccatgggctcctgaaaaccgaggcagagagctacgaggggctcc 
tg<;caccatcg<:tgatcccjuuu^ctggccagaccaa 

gcgctacgacagctccctgaagccggtgctgaagcacgtcaatgccctcatctcccctggacagaag 

acacgttcgaagggcac^tcatcattgatggcattgacatcgcc^ 

ctcacgcctctccatcatcctgcaggaccccgtcctcttcagcggcaccatccx^tttaacctggac 

cctgagaggaagtgc tcagatagcacactgtgggaggccctggaaatcgccc agctgaagc tggtgg 

tgaaggcactgccaggaggcctcgatgccatcatcacagaaggcggggagaatttcagccagggaca 

gaggcagctgttctgcctggcccgggccttcgtgaggaagaccagcatcttcatcatggacgaggcc 

acggcttccattgacatggccacggaaaacatcctccaaaaggtggtgatgacagccttcgcagacc 

gcac tgtggtc acc atcgcgc atcgagtgc acaccatc ctgagtgcagacc tggtgatcgtcctgaa 

gcggg^tcccatccttgagttcgataagccagagaagctgctcagccggaaggacagcgtcttcgcc 

tccttcgtcxgtgcagacaagtgacctgccagagcccaagtgccatcccacattcggaccctgccca 

TA 



ORF Start: ATG at 36 | ORF Stop: TGA at 4779





SEQ ID NO: 224 | 1581 aa | MW at 177005.9kD


NOV51a,
CG57308-01 
Protein Sequence 


MPLAFCGSENHSAAYRVDQGVLNNGCFVDALNW 

FPGHNLRWI LTFMLLFVLVC E I AEGILSIX5VTESHHLHL YMPAGMAFMAAVTS WYYHNI ETSNFPK 
LLIALLVYWTIjAFXTXTIKFVKFLDHAIGFSQLRFCLTGLLV^ 

REVKPPEDLQDLGVRFLQPFVNLLSKGTYWWMNAFIKTAHKKPIDLRAIGKLPIAMRALTNY 
AFDAOVRKDIOGTOGARAIWOALSHAFGRRLVLSSTFRILADLI^FAGPLCIFGIVDHLGKENDVFO 

rFLQASYYVAIETGlNLRGAIQTKIYNKIMHL 



PKTQFLGVYFVSSQEFIJWAYVLAVLLFLALLLQRTFI^ASYYVAI^^ 
STSNLSMGEMTAGQICNLVAI DTNQLMWFFFLC PNLWAMPVQI IVGVILLYYILGVSALIGAAVI IL 
LAPVQYFVATKLSQAQRSTLEYSNERLKQTNEMLRGIKLLKra 

AI YTS I S IFMNTA I PIAAVL I TFVGHVSFFKEADFS PSVAFASLSLFHI LVTPLFLLSSVVRSTVKA 
LVS VQKL SEFL S S AEIREEQC APHBPTPQGPASKYQAVPLRVVNRKRP AREDCRGLTGPLQSLVPS A 
DGDADNCCVQ IMGG YFTWT PDGI PTLSNI TIRI PRGQL1MIVG QVGCGKS SLLLAALGEMQKVSG AV 
FWS SLPDSEIGEDPS PERETATDLDIRKRGFVAYASQKPWLLNATVEENI I FESPFNKQRYKMVI EA 
CSLQPDIDILPHGDQTQIGERGINLSGGQRQRISVARALYQHANVVFLDDPFSALDIHIiSDHLMQAG 
ILELLRDDKRTVVLVTHKLQYLPHADWIIAMKDGTIQREGTIJa5FQRSECQLFEHWK 
EKETVTERKATEPPQGLSRAMSSRBGU,QDEEEEEEEAAESEEDDm*SSMLHQRAEIPWRACAKYLS 
SAGILLLSLLVFSQLLKHMVXA/AIDYWLAKWTO^ 
LGIVLCXVTSVTVEWTGLKVAXRLHRSLLNRIII^ 

ECLSR STLLCVSALAVI S YVT PVFLVALL PLAI VC YF IQKYFRVASRDLQQLDDTTQLPLLSHFAET 
VEGLTTIRAFRYEARFQQKLLEYTDSNNIASLFLTAAl^WLEVRMEYIGAC\ArLIAAVTSISNSbHR 
ELSAGLVGLGLTYALMVS*T¥LNWMVRNIJ^MBLQ 

QGKI Q IQNLSVRYDS SLKPVLKHVNAL I S PGQKIGICGRTG SGKS SFSL AFFRMVDTFEGHI I IDG I 
DIAKLPLHTLRSRLS I ILQDPVLFSGTIRFNLDPERKCSDSTLWEALEIAQLKLWKALPGGLDAI I 
TEGGENFSQGQRQLFCIJtfUlFVRJCTS IFIMDEATASIDM^^ 
ILSADLVIVLKRGAILEFDKPEKLLSRKDSVFASFVRADK 





SEQ ID NO: 225 | 4745 bp


NOV51b, 
CG57308-02 
DNA Sequence 


CGGGGCCCGGGGGGCGGGGGCCTGACGGCCGGGCCGGGCGGCGGAGCTGCAAGGGACAGAGGCGCGG 


CAGGCGCGCGGAGCCAGCGGAGCCAGCTGAGCCCGAGCCCAGCCCGCGCCCGCGCCGCCATGCCCCT 


GGCCTTCTGCGGCAGCGAGAACCACTCGGCCGCCTACCGGGTGGACCAGGGGGTCCTCAACAACGGC 

TGCTTTGTGGACGCGCTCAACGTGGTGCCGCACGTCTTCCTACTCTTCATCACCTTCCCCATCCTCT 

TCATTGGATGGGGAAGTCAGAGCTCCAAGGTGCACATCCACCACAGCACATGGCTTCATTTCCCCGG 

GCACAACCTGCGGTGGATCCTGACCTTCATGCTGCTCTTCX3TCCTGGTGTGTGAGATTGCAGA 

ATCCTGTCTGATGGGGTGACCGAATCCCACCATCTGCACCTGTACATGCCAGCCGGGATGGCGTTCA 

TGGCTGCTGTCACCTCCGTGGTCTACTATCACAACATCGAGACTTCCAACTTCCCCAAGCTGCTAAT 

TGCCCTGCTGGTGTATTGG AC CCTGGCCTTCATCACC AAGACC ATC AAGTTTGTC AAGC TCTTGGAC 

CACGCCATCGGCTTCTCGCAGCTACGCTTCTGCCTCACAGGGCTGCTGGTGATCCTCTATGGGATGC 

TGCTCCTCGTGGAGGTCAATGTCATCAGGGTGAGGAGATACATCTTCTTCAAGACACCGAGGGAGGT 

GAAGCCTCCCGAGGACCTGCAAGACCTGGGGGTACGCTTCCTGCAGCCCTTCGTGAATCTGCCGTCC 

AAAGGCACCTACTGGTGGATGAACGCCTTCATCAAGACTGCCCACAAGAAGCCCATCGACTTGCGAG 

CCATCGGGAAGCTGCCCATCGTTATGAGGGCCCTCACCAACTACCAACGGCTCTGCGAGGCCTTTGA 

CGCCCAGGTGCGGAAGGACATTCAGGGCACTCAAGGTGCCCGGGCCATCTGGCAGGCACTCAGCCAT 

GCCTTCGGGAGGCGCCTGGTCCTCAGCAGCACTTTCCGCATCTTGGCCGACCTGCTGGGCTTCGCCG 

GGCCACTGTGCLATCTTTGGGATCGTGGACCACCTTGGGAAGGAGAACGACGTCTTCCAGCCCAAGAC 

ACAATTTCTCGGGGTTTACTTTGTCTCATCCCAAGAGTTCCTTGCCAATGCCTACGTCTTAGCTGTG 

CTTCTGTTCCTTGCCCTCCTACTGCAAAGGACATTTCTGCAAGCATCCTACTATGTGGCCATTGAAA 

CTGGAATTAACTTGAGAGGAG C AAT AC AG ACCAAGATTTACAATAAAAT TATGC ACC TGTCCACCTC 

CAACCTGTCCATGGGAGAAATGACTGCTGGACAGATCTGTAATCTGGTTGCCATCGACACCAATCAG 

CTC ATGTGG TTTTTCTTCTTGTGCCCAAACCTCTGGGCTATGC C AGTAC AGATC ATTGTGGGTGTGA 

TTCTCCTCTACTACATACTCGGAGTCAGTGCCTTAATTGGAGCAGCTGTCATCATTCTACTGGCTCC 

TGTCCAGTACTTCGTGGCCACCAAGCTGTCTCAGGCCCAGCGGAGCACACTGGAGTATTCCAATGAG 

CGGCTGAAGCAGACCAACGAGATGCTCCGCGGCATCAAGCTGCTGAAGCTGTACGCCTGGGAGAACA 

TCTTCCGCACGCGGGTGGAGACGACCCGCAGGAAGGAGATGACCAGCCTCAGGGCCTTTGCCATCTA 

TACCTCCATCTCCATTTTCATGAACACGGCCATCCCCATTGCAGCTGTCCTCATAACTTTCGTGGGC 

CATGTCAGCTTCTTCAAAGAGGCCGACTTCTCGCCCTCCGTGGCCTTIX^CTCCCTCTCCCTCTTCC 

ATATCTTGGTCACACCGCTGTTCCTGCTGTCCAGTGTGGTCCGATCTACCGTCAAAGCTCTAGTGAG 

CGTGCAAAAGCTAAGCGAGTTCCTGTCCAGTGCAGAGATCCGTGAGGAGCAGTGTGCCCCCCATGAG 

CCCACACCTCAGGGCCCAGCCAGCAAGTACCAGGCGGTGCCCCTCAGGGTTGTGAACCGCAAGCGTC 

CAGCCCGGGAGGATTGTCGGGGCCTCACCGGCCCACTGCAGAGCCTGGTCCCCAGTGCAGATGGCGA 

TGCTGACAACTGCTGTGTCCAGATCATGG<3AGGCTACTTCACGTGGACCCCAGATGGAATCCCCACA 

C TCTCCAAC ATCACCATTCGTATC CCCCGAGGCCAGCTGACTATGATCGTGGGGC AGGTGGGC TGCG 

gcaagtcctcgctccttctagccgcactgggggagatgcagaaggtctcaggggctgtcttctggag 
cagccttcctgacagcgagataggagaggaccccagcccagagcgggagacagcgaccgacttggat 
atcaggaagagaggccccgtggcctatgcttcgcagaaaccatggctgctaaatgccactgtggagg 
agaacatcatctttgagagtcccttcaacaaacaacggtacaagatggtcattgaagcctgctctct 
gcagccagacaox:gacatcctgccccatcgagac<:agacccagattggggaacggggcatcaacctg 

TCTGGTGGTCAACGCCAGCGMTCAGTGTGGCCCX3AGCCCTCTACCAGCACGCCAACGTTGTCTTCT 
TGGATGACCCCTTCTCAGCTCTGGATATCCATCTGAGTGACCACTTAATGCAGGCCGCXATCCTT^ 
GCTGC TCCGGGACGAC AAGAGGAC AGTGGTCT TAGTGACCCAC AAG C T ACAGTAC C TGC C CCATGCA 
GACTGGATCATTGCCATGAAGGATGGC^CCATCCAGAGGGAGGGTACCCTCAAGGACTTCCAGAGGT 
CTGAATGCCAGCTCTTTGAGCACTGGAAGACCCTCATGAACCGACAGGACC1AAGAGCTGGAGAAGGA 
GACTGTCACAGAGAGAAAAGCCACAGAGCCACCCCAGGGCCTATCTCGTGCCATGTCCTCGAGGGAT 
GGCCTTCTGCAGGATGAGGAAGAGGAGGAAGAGGAGGCAGCTGAGAGCGAGGAGGATGACAACCTGT 
CGTCCATGCTGCACCAGCGTGCTGAGATCCCATGGCGAGCCTGCGCCAAGTACCTGTCCTCCGCCGG 
CATCCTGCTCCTGTCGTTGCTGGTCTTCTCACAGCTGCTCAAGCACAT^ 



TACTGGCTGGCCAAGTGGACCGACAGCGCCCTGACCCTGACCCCTGCAGCCAGGAAC 
GCCAGGAGTGCACCCTCGACCAGACTGTCTATGCCATGGTGTTCACGGTGCTCTGCAGCCTGGGCAT 
TGTGCTGTGCCTCGTCACGTCTGTCACTGTGGAGTGGACAGGGCTGAAGGTGGCCAAGAGACTGCAC 
CGCAGCCTGCTAAACCGGATCATCCTAGCCCCCATGAGGTTTTTrGAGACCACGCCCCTTGGGAGCA 
TCCTGAACAGATTTTCATCTGACTGTAACACCATCGACCAGCACATCCCATCCACGCTGGAGTGCCT 
GAGCCGCTCCACCCTGCTCTGTGTCTCAGCCCTGGCCGTCATCTCCTATGTCACACCTGTGTTCCTC 
GTGGCCCTCTTGCCCCTGG C CATCGTGTGC TACTTCATCCAGAAGTACTTCCGGGTGGCGTCCAGGG 
ACCTGCAGCAGCTGGATGACACCACCCAGCTTCCACTTCTCTCACACTTTGCCGAAACCGTAGAAGG 
ACTCACC ACC ATC CGGGCCTTC AGGTATGAGGCCCGGTTCCAGCAGAAGCTT CTCGAATAC ACAG AC 
TCCAACAACATTGCTTCCCTCTTCCTCACAGCTGCCAACAGATGGCTGGAAGTCCGAATGGAGTACA 
TCGGTGC ATGTGTGGTGCTC ATCGC AGCGGTGACCTCCATCTCCAAC TC CCTGC AC AGGGAGCTCTC 
TGCTGGCCTGGTGGGCCTGGGCCTTACCTACGCCCTAATGGTCTCCAACTACCTCAACTGGATGGTG 
AGGAACCTGGCAGACATGGAGCTCCAGCTGGGGGCTGTGAAGCGCATCCATGGGCTCCTGAAAACCG 
AGGCAGAGAGCTACGAGGGGCTCCTGGCACCATCGCTGATCCCAAAGAACTGGCCAGACCAAGGGAA 
G ATCC AGATCC AGAACC TGAGCGTGCGCTACG AC AGCTCCCTGAAGC CGGTGC TGAAGCACGTC AAT 
GCCCTCATCTCCCCTGGACAGAAGATCGGGATCTGCGGCCGCACCGGCAGTGGGAAGTCCTCCTTCT 
CTC TTGCCTTCTTC CGC ATGGTGGACACGTTCGAAGGGCAC ATCATCAC AGAAGGCGGGGAGAATTT 
CAGCC AGGGAC AGAGGC AGCTGTTC TGCCTGGCCCGGGCC TTCGTG AGGAAGACCAGC ATCTTCATC 
ATGGACGAGGCCACGGCTTCCATTGACATGGCCACGGAAAACATCCTCCAAAAGGTGGTGATGACAG 
CCTTCGCAGACCGCACIX5TGGTCACCATCGCGGATCGAGTGCACACCATCCTGAGTGCAGACCTGGT 
GATCGTCCTGAAGCGGGGTGCCATC C TTGAGTTCGATAAGCCAGAGAAGCTGCTCAGCCGGAAGGAC 
AGCGTCTTCGCCTCCTTCGTCCGTGCAGACAAGTG ACCTGCCAGAGCCCAAGTGCCATCCCACATTC 



GGACCCTGCCCATACCCCTGCCTGGGTTTTCTAACTgTAAATCACTTGTAAATAA 



ORF Start: ATG at 127 | ORF Stop: TGA at 4657





SEQ ID NO: 226 | 1510 aa | MW at 169179.9kD


NOV51b,
CG57308-02
Protein Sequence 


MPLAFCGSENHSAAYRVDQGVLNNGCFVDALNWPHVFLLFITFPILFIGWGSQSSKVHIHH^ 
FPGHNLRWILTFMLLFVLVCEIAEGILSDGVTESHHL^ 

LLIALLVVOTLAFITKTIKFVKLLDHAIGFSQLRFCLTGLLVILyGMLLLVEVNVIRVRRYIFFKTP 
REVKPPEDLQDLGVRFLQPFVNLPSKGTYWWMNAFIKTAHKKPIDLRAIGKLPIVMRAIj 
AFDAQVRKDI QGTQGARAIWQ ALSKAFGRRLVLSSTFR ILADLLGFAGPLC I FGTVDHLGKENDVFQ 
PKTQFLGVYFVSSQEFLANAYVLAVLLFLALLLQ 

STSNLSMGEMTAGQ ICNLVAIDTNQLMWFFFLCPNLWAMPVQI IVGVILLYYTLGVSALIGAAVI IL 
LAPVQYFVATKLSQAQRSTLEYSNERLKQTNEMLRGIKLLKLYAWFJJIFRTRVETTRRKEOTSI^AF 
AIYTSISIFMNTAIPIAAVLITFVGHVSFFKEADFSPSVAFASLSLFHILVTPLFLLSSVVRSTVKA 
LVSVQKLSEFLSSAEIREEQCAPHEPTPQGPASKYQAVPLRVVNRBCRPAREDCRGIiTGPLQSLVPSA 
DGDADNCCVQIMGGYFTWTPDG I PTLSNITIR I PRGQLTMIVGQVGCGKSSLLLAAIiGEMQKVSGAV 
FWSSLPDSEIGEDPS PERETATDLDIRKRQPVAYAS pKPWLLNATVEENI I FES PFNKQRYKMVI EA 
CSLQPDIDILPHGDQTQIGERGINLSGGQRQRISVARAI^ 

I L ELLRDDKRTWL VTHKLQ YL PHADWI I AMKDGT I QREGTLKDFQRS ECQL FEHWKTLMNRQDQEL 
EKETVTERKATEPPQGLSRAMSSRDGLLQDEEEEEEEAAESEEDDNLSSMLHQRAEIPWRACARYLS 
SAG I LLLSLLWSQLLKHMVLVAIDYWLAKWTDSALTLTPAARNC SLSQECTLDQTVYAMVFTVLC S 
LGIVLCLVTSVTVEWTGLKVAKRLHRSLLNRIILAPMRFFETTPLGSILNRFSSDCNTIDQHIPSTL 
ECLSRSTLLCVSALAVISYVTPVFLVALLPLAJVCYFIQK^^ 

VEGLTTIRAFRYEARFQQKLLEYTDSNNIASLFLTAANRWLEVRMEYIGACVVLIAAVTSI SNSLHR 

ELSAGLVGIX3LTYALMVSNYI,NWMVRlSn^ADMELQLGAVKRIHGLLK 

QGKIQIQNLSWYDSSLKPVLKHVI^ISPGQ^ 

ENFSQGQRQLFCLARAFVRKTS IFIMDEATAS IDMATENILQKVVMTAFADRTVVTIAHRVHTILSA 
DLVT VLKRGAILEFDK PEKLL SRKD S VFASFVRADK 



Sequence comparison of the above protein sequences yields the sequence
relationships shown in Table 51B.



Table 51B. Comparison of NOV51a against NOV51b.

Protein Sequence | NOV51a Residues/Match Residues | Identities/Similarities for the Matched Region
NOV51b | 1..1406; 1..1406 | 1285/1406 (91%); 1286/1406 (91%)



Further analysis of the NOV51a protein yielded the following properties shown in 
Table 51C.



Table 51C. Protein Sequence Properties NOV51a


PSort analysis:


0.8000 probability located in plasma membrane; 0.4000 probability located in
Golgi body; 0.3000 probability located in endoplasmic reticulum (membrane);
0.3000 probability located in microbody (peroxisome)


SignalP analysis:


Cleavage site between residues 56 and 57



A search of the NOV51a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 51D.



Table 51D. Geneseq Results for NOV51a








Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date]


NOV51a
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Region 


Expect 
Value 


AAW57412 


Homo sapiens sulphonylurea 
receptor - Homo sapiens, 
1580 aa. [WO9814571-A1,
09-APR-1998] 


1..1581
1..1580 


1530/1582 (96%) 
1540/1582 (96%) 


0.0 


AAR77087 


Rat sulphonylurea receptor - 
Rattus sp, 1582 aa. 
[WO9528411-A1,
26-OCT-1995] 


1..1581 
1..1582 


1477/1582(93%) 
1509/1582 (95%) 


0.0 


AAR77088 


Hamster sulphonylurea 
receptor - Cricetus sp, 1582 
aa. [WO9528411-A1,
26-OCT-1995] 


1..1581 
1..1582 


1469/1582(92%) 
1506/1582 (94%) 


0.0 


AAR77084 


Rat sulphonylurea receptor - 
Rattus sp, 1498 aa. 
[WO9528411-A1,
26-OCT-1995] 


1..1290 
1..1291 


1195/1291(92%) 
1223/1291 (94%) 


0.0 
AAR77085


Hamster sulphonylurea
receptor - Cricetus sp, 1498
aa. [WO9528411-A1,
26-OCT-1995]


1..1290
1..1291


1186/1291 (91%)
1220/1291 (93%)


0.0



In a BLAST search of public sequence databases, the NOV51a protein was found to
have homology to the proteins shown in the BLASTP data in Table 51E.
Table 51E. Public BLASTP Results for NOV51a


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV51a
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Portion 


Expect 
Value 


Q09428 


Sulfonylurea receptor 1 - 
Homo sapiens (Human), 1580 
aa. 


2..1581 
1..1580 


1579/1580(99%) 
1579/1580(99%) 


0.0 


Q09429 


Sulfonylurea receptor 1 - 
Rattus norvegicus (Rat), 1581 
aa. 


2..1581 
1..1581 


1512/1582 (95%) 
1536/1582(96%) 


0.0 


Q09427 


Sulfonylurea receptor 1 - 
Cricetus cricetus 
(Black-bellied hamster), 1581 
aa. 


2..1581 
1..1581 


1498/1582(94%) 
1530/1582(96%) 


0.0 


A56248 


sulfonylurea receptor - golden 
hamster, 1582 aa. 


1..1581 
1..1582 


1469/1582 (92%) 
1506/1582 (94%) 


0.0 


Q95J92 


Sulphonylurea receptor 2B - 
Oryctolagus cuniculus 
(Rabbit), 1549 aa. 


1..1580 
1..1548 


1076/1581 (68%) 
1277/1581 (80%) 


0.0 



PFam analysis predicts that the NOV51a protein contains the domains shown in the 
Table 51F. 



Table 51F. Domain Analysis of NOV51a 


Pfam Domain 


NOV51a Match Region 


Identities/ 
Similarities 
for the Matched 
Region 


Expect Value 


ABC_membrane 


318..590 


53/287(18%) 
212/287 (74%) 


3.6e-46 
ABC_tran 



706..905 
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55/214 (26%) 
154/214(72%) 



1.3e-34 



ABC_membrane 



1011..1298 



58/292(20%) 
222/292 (76%) 



2.7e-51 



PRK 



1374..1391 



6/19 (32%) 
15/19 (79%) 



0.21 



ABC_tran 



1371..1554 



54/199 (27%) 
129/199 (65%) 



5.7e-36 



Example 52. 

The NOV52 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 52A. 



Table 52A. NOV52 Sequence Analysis 




SEQ ID NO: 227 | 1404 bp 


NOV52a, 
CG93659-01 
DNA Sequence 


ATGGAGTACATGAGCACTGGAAGTGACAATAAAGAAGAGATTGATTTATTAATTAAACATTTAAATG 

tgtctg atgt aatagac att atgg aaaatctt tatgc aagtg aagagcc ag cag ttt atg aacc cag 

tctaatgaccatgtgtgaagacagtaatcaaaacgatgagcgttc 

caagaggtaccat^ttgtcatcagtcagatatggaactgtggaggatt^ 

atatatccaacactgcaaagcatttttatggacaacgaccacaggaatctggaattttattaaacat 

ggtcatcactcco:aaaatggacgttaccaaatagat^ccgatgttctcctgatcccctggaagctg 

acttacaggaatattggttctgattctattcctcggggcgcct 

atataaagacgaagaaaagaatggcgtgtaaactgatcccagtagatcaatttaagccatctgatgt 

ggaaattcaggcttgcttccggcacgagaacatcgcagagctgtatggcgcagtcctgtggggtgaa 

actctcc^tctctttatggaagcaggcgagggagggtctc 

caatgagagaatttgaaattatttgggtgacaaagcatgttctcaa 

aaagaaagtgatccatcatgatattaaacctagcaacattgttttcatgtccacaaaagctgttttc 

gtggattttggcctaagtgttcaaatcaccgaagatgtctat 

agatttacatgagcccagaggtcatcctgtgcaggggccattcaaccaaagcagacatctacagcct 
gggggccacgctcatccacatgcagacgggcaccccaccctgggtgaagcgctaccctcgctcagcc 
tatccctcctacctgtacataatccacaagcaagcacctccactggaagacattgcagatgactgca 
gtccagggatoagagagctgatagaagcttccctggagagaaaccccaatcaccgcccaagagccgc 
agacctactaaaacatgaggccctgaacccgcccagagaggatcagccacgctgtacgagtctggac 
tctgccctcttggagcgcaagaggctgctgagtaggaaggagctggaacttcctgagaacattgctg 
ATTC TTCG TGC AC AGGAAGCAC cgaggaatc tgagatgc tcaagaggcaacgctctctctacatcga 
cctcg^gctctggctggctacttcaatcttgttcggggacc^ccaacgcttgaatatggct^ 




ORF Start: ATG at 1 | ORF Stop: TGA at 1402 





SEQ ID NO: 228 | 467 aa | MW at 52896.9kD 


NOV52a, 
CG93659-01 
Protein Sequence 


MEYMSTGSDl^EEIDLLIKHLNVSDVIDIMENLyASEEPAVYEPSLMTMCQDSNQND 

QEVPWL SS VR YGTVBDLLAFANHI SNTAKHF YGQRPQESG I IiLNMV IT PQNGRYQIDSDVLLI PWKIj 

TYRNI GSDF I PRGAFGKVYLAQD I KTKKRMAC KL I PVDQFKPSDVE I QAC FRHEN I AEL YGA VL WGE 

TVHLFMEAGEGG S\H^EKIjES CGPMREFEI IWVTKHVLKGLDFLHSKKVIHHDIKPSNI VPMSTKAVL 

VDFGLSVQMTEDVYFPKDIJ^GTEIYMSPEVILCRGHSTKADIYSIXSATLIHMQTC 

YPSYLYI IHKQAP PLEDI ADDCSPGMREL IEASLERNPNHRPRAAJDLLKHEALNPPREDQPRCTSLD 

SALLERKRLLSRKELELPENIADSSCTGSTEESEMLKRQRSLYIDIXSAIAGYFl^^GPPTLEYG 
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SEQ ID NO: 229 


1430 bp 


NOV52b, 
CG93659-03 
DNA Sequence 


CTGACACTGCACTGAGCACTTTATGAGCTTGAACTCTGTTAATCCTCACGACCACCTCATGAGACTC 


TCCAGAAAGAGCAACAGTAATGGAGTACATGAGCACTGGAAGTGACAATAAAGAAGAGATTGATTTA 

TTAATTAAACATTTAAATGTGTCTGATGTAATAGACATTATGGAAAATCTTTATGCAAGTGAAGAGC 

CAGCAGTTTATGAACCCAGTCTAATGACCATGTGTCAAGACAGTAATCAJ^CGATGAt^TTCTAA 

GTCTCTGCTGCTTAGTGGCCAAGAGGTACCATGGTTGTCATCAGTCAGATACGGAACTGTGGAGGAT 

TTGCTTGC TTTTGC AAACC ATATATC CAAC ACTGC AAAGCATTTTTATGGAC AACGACC AC AGG AAT 

CTGGAATTTTATTAAACATGGTCATCACTCCCCAAAATGGACGTTACCAAATAGATTCCGATGTTCT 

CCTGATCCCCT^AAGCTGACTTACAGGAATATTGGTTCTGATTTTATTTCTCGGGGCGCCTTTGGA 

AAGGTATACTTGGCACAAGATATAAAGACGAAGAAAAGAATGGCGTGTAAACTGATCCCAGTAGATC 

AATTTAAGCCATCTGATGTGGAAATCCAGGCTTGCTTCCGGCACGAGAACATCGCAGAGCTGTATGG 

CGCAGTCCTGTGGGGTGAAACTGTCCATCTCTTTATGGAAGCAGGCGAGGGAGGGTCTGTTCTGGAG 

AAACTGG AG AGC TGTGG ACC AATGAGAGAATTTGAAATTATTTGGGTG AC AAAGCATGTTC TC AAGG 

GACTTGATTTTCTACACTCAAAGAAAGTGATCCATCATGATATAAACATTTACATGAGCCCAGAGGT 

C ATCC TGTGCAGGGGCCATTCAACCAAAGC AGAC ATC TAC AGCCTGGGGGCC ACGCTC ATC CACATG 

CAGACGGGCACCCCACCCTGGGTGAAGCGCTACCCTCGCTCAGCCTATCCCTCCTACCTGTACATAA 

TCC ACAAG CAAGC ACC TCCACTGG AAGAC AT TGC AGATGACTGCAGTCCAGGG ATGAGAGAGCTGAT 

AGAAGCTTCCCTGG AGAGAAACCC CAATCACCGCCCAAGAGCCGC AGACCTAC TAAAAC ATGAGGCC 

CTGAACCCGCCCAGAGAG^TCAGCCACGCTCTCAGAGTCTGGACTCTGCCCTCTTGGAGCGCAAGA 

GGCTGCTGAGTAGGAAGGAGCTGGAACTTCCTGAGAACATTGCTGATTCTTCGTGCACAGGAAGCAC 

CGAGGAATCTGAGATGCTCAAGAGX^AACGCTCTCTC'PACATCGACCTCGGCGCTCTGGCTGGCTAC 

TTCAATCTTGTTCGGGGACCACCAACGCTTGAATATGGCTGJVAG 

AGACAGCATTGATCTCCTGGAGG 




ORF Start: ATG at 87 | ORF Stop: TGA at 1380 
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SEQ ID NO: 230 |431 aa |MW at 48882.2kD 


NOV52b, 
CG93659-03 
Protein Sequence 


MEYMS TG SDNKEE I DLL I KHLNVSDVID IMENL YASEE PAVYEPSLMTMCQDSNQNDERSKSLLLSG 

QEVPWL S SVRYGTVEDLIiAPANHI SWTAKHFYGQRPQESGILLNMVITPQNGRYQ IDSDVLLI PWKL 

TYRNIGSDF I SRGAFGKVYL AQDIKTKKRMACKL I PVDQFKPSDVEIQACFRHENIAELYGAVLWGE 

TVHLFMEAGEGGSVLEKLESCGPMREFEIIWVTKHVLKGIJjFLHSKKVIHHDINI 

STKADI YSIiGATLIHMO/IGTPPWVKRYPRSAYPSYLYI IHKQAPPLEDI ATJGDCSPGMRELI EASLER 

NPI^PRAADLLKHEAIJ^PPREDQPRCQSLDSALLERKRI^SRKELELPENIADSSCTGSTEESE^ 

KRQRSLYIDLGALAGYFNLVRGPPTLEYG 
SEQ ID NO: 231 | 1538 bp 


NOV52c, 
CG93659-02 
DNA Sequence 


ctgacactgcactgagcactttatgagcttgaactctgttaato:tcacgaccacctcatgagactc 


TCCAGAAAGAGCAACAGTAATGGAGTACATGAGCACTGGAAGT<^CAATAAAG7^AGAGATTGATTTA 
TTAATTAAACATTTAAATGTGTCTGATGTAATAGACATTATGGAAAATCTTTATGCAAGTGAAGAGC 
CAGCAGTTTATGAACCCAGTCTAATGACCATGTGTCAAGACAGTAATCAAAACGATGAGCXjTTCTAA 
GTCTCTGCTGCTTAGTGGCCAAGAGGTACCATGGTTGTCATCA^ 

TTGCTTGCTTTTGCAAACCATATATCCAACACTGCAAAGCATTTTTATGGACAACGACCACAGGAAT 
CTGGAATTTTATTAAACATGGTCATCACTCCCCAAAATGGACGTTACCAAATAGATTCCGATGTTCT 
CCTGATCCCCTGGAAGCTGACTTACAGGAATATTGGTTCnXSATTTTATTTCTCGGGGCGCCTTTGGA 
AAGGTATACTTGGCACAAGATATAAAGACIGAAGAAAAGAATGGCGTGTAAACTGATCCCAGTAGATC 
AAT TT AAGCC ATC TG ATG1X5GAAATCCAGGCTTG C TTC CGGC ACG AGAAC ATCGC AGAGCTGTATGG 
CGCAGTCCTCTGGGGTtSAAACTGTCCATCTCTTTATGGAAGCAGGCGAGGGAGGGTCTGTTCT 
AAACT<^ AGLAGCTGTG GACC AATC AG AG AATTTGAAAT T ATTTGGGTGAC AAAGC ATGT TC TCAAGG 
GAC TTGATTT TC T AC ACTC AAAG AAAG TG ATCC ACCATG AT AT T AAACC T AGC AAC ATTGTTTTC AT 
GTCCACAAAAGCT<3TTTTGGT<X5ATTTTGGCCTMGTG 

AAGG ACC TCCG AGG AACAGAGAT^ACATG AGCCC AGAGGTCIATC C TGTGCAGTGGC CATTC AACCA 

AAGCAGACATCTACAGCCTGGGGGCCACGCTCATCCACATOCAGACGGGCACCCCACCCTGGG 

GCGCTACCCTCGCTCAGrcTATCCCTCCTACCTOTACATAATCC^CAAGCAAGCACCTCCACTGGAA 

GACATTG C AG ATG ACTGC AG TCC AGGGATGAG AGAGC TGAT AGAAGC TTC C CTGG AGAGAAAC CCC A 

ATCACCGCCCAAGAGCCGCAGACCTACTAAAACATGAGGCCCTGAACCCGCCCAGAGAGGATCAGCC 

ACGCTGTCAGAGTCTGGACTCTGCCCTCTTGGAGCGCAAGAGGCTGCTGAGTAGGAAGGAGCTGGAA 

CTTCCTGAGAACATTGCTGATTCTTTCTGCACAGGAAGCACO^GGA^ 

AACGCTCTCTCTACATCGACCTCGGCGCTCTGGCIOTCTACTTCAATCTTCTTCGGGGA 
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r^TTRAATATGGCTOAAGGATGCCATGTTTGCTCT. 






ORF Start: ATG at 87 | ORF Stop: TGA at 1488 





SEQ ID NO: 232 | 467 aa | MW at 52844.7kD 


NOV52c, 
CG93659-02 
Protein Sequence 


MEYMSTGSDNKEE IDLL IKHLNVSDVI DIMBNLYAS EEPAVYEPSLMTMCQDSNQNDERSKSLLLSG 
QEVPWL S SVRYGTVEDLLAFANH I SNTAJKHF YGQR PQESG I LLNMVITPQNGRYQ IDSDVLL I PWKL 
TYRNI GSDFI SRGAFGKVYLAQDIKTKKRMACKI* I P VDQFKP SDVEIQACFRHENI AEli YGAVLWGE 
TVHLFMEAGEX5GSVLEKLBSCGPMREFEIIWVTKHVLKGLDFLHSKKVIHTO 
TOFGLSVQMTEI^FPKDLRGTEIYMSPEVILCSGHSTK^^ 

YPSYLYIIHKQAPPLEDIADDCSPGMRELIEASLERNPNHRPRAADLLKHEALNPPREDQPRCQSLD 
SALLERKRLLSI^ELELPENIADSSCTGSTEESEMLKRQRS 
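
The ORF coordinates and protein lengths reported in Table 52A can be cross-checked against one another. The following is a minimal sketch, assuming that each reported stop position is the 1-based index of the first base of the stop codon (an assumption that is not stated in the text, although it agrees with all three NOV52 entries). The function name `orf_protein_length` is hypothetical and used only for illustration.

```python
def orf_protein_length(start: int, stop: int) -> int:
    """Return the number of residues encoded between a start and a stop codon.

    Assumes 1-based coordinates, where `start` is the first base of the ATG
    and `stop` is the first base of the stop codon (an assumption).
    """
    coding_bases = stop - start
    if coding_bases <= 0 or coding_bases % 3 != 0:
        raise ValueError("ORF coordinates are not in frame")
    return coding_bases // 3


# ORF start/stop positions as reported in Table 52A.
nov52_orfs = {"NOV52a": (1, 1402), "NOV52b": (87, 1380), "NOV52c": (87, 1488)}

lengths = {name: orf_protein_length(*coords) for name, coords in nov52_orfs.items()}
print(lengths)
```

Under that assumption the computed lengths agree with the 467 aa, 431 aa and 467 aa figures given for NOV52a, NOV52b and NOV52c, respectively.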



Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 52B. 
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Table 52B. Comparison of NOV52a against NOV52b and NOV52c. 


Protein Sequence 


NOV52a Residues/ 
Match Residues 


Identities/ 

Similarities for the Matched Region 


NOV52b 


1..467 
1..431 


413/467 (88%) 
413/467 (88%) 


NOV52c 


1..467 
1..467 


449/467 (96%) 
449/467(96%) 
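
The percentages in Table 52B follow directly from the identity counts. The sketch below is illustrative only: it assumes that percentages are truncated to a whole number rather than rounded (for these two rows either convention gives the same value), and the helper name `percent_identity` is hypothetical.

```python
def percent_identity(matches: int, aligned_length: int) -> int:
    """Return matches / aligned_length as a whole-number percentage.

    Truncation (integer division) is assumed here rather than rounding.
    """
    if aligned_length <= 0:
        raise ValueError("aligned length must be positive")
    return 100 * matches // aligned_length


# Identity counts for NOV52a against its variants, taken from Table 52B.
table_52b = {"NOV52b": (413, 467), "NOV52c": (449, 467)}

for name, (matches, aligned) in table_52b.items():
    print(f"{name}: {matches}/{aligned} ({percent_identity(matches, aligned)}%)")
```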



Further analysis of the NOV52a protein yielded the following properties shown in 
Table 52C. 



Table 52C. Protein Sequence Properties NOV52a 


PSort analysis: 


0.6500 probability located in cytoplasm; 0.1000 probability located in 
mitochondrial matrix space; 0.1000 probability located in lysosome (lumen); 
0.0000 probability located in endoplasmic reticulum (membrane) 


SignalP analysis: 


No Known Signal Sequence Predicted 
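
A PSort result line of the kind shown in Table 52C can be turned into structured data for ranking the predicted localizations. The sketch below is an assumption-laden illustration: it parses only the textual format printed in this table (not the native output of the PSort program), and the function name `parse_psort` is hypothetical.

```python
import re

# PSort summary for NOV52a, transcribed from Table 52C.
psort_text = (
    "0.6500 probability located in cytoplasm; "
    "0.1000 probability located in mitochondrial matrix space; "
    "0.1000 probability located in lysosome (lumen); "
    "0.0000 probability located in endoplasmic reticulum (membrane)"
)


def parse_psort(text):
    """Return a list of (location, probability) pairs from a PSort summary line."""
    pattern = re.compile(r"(\d\.\d+) probability located in ([^;]+)")
    return [(location.strip(), float(prob)) for prob, location in pattern.findall(text)]


predictions = parse_psort(psort_text)
# Pick the localization with the highest reported probability.
best = max(predictions, key=lambda item: item[1])
print(best)
```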



A search of the NOV52a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publication, yielded 
several homologous proteins shown in Table 52D. 
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Table 52D. Geneseq Results for NOV52a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV52a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Region 


Expect 
Value 


AAE05951 


Human cot oncoprotein 
encoded by uLwyf 
oncogene - Homo sapiens, 
467 aa. [US6265216-B1, 
24-JUL-2001] 


1..467 


467/467 (100%) 


0.0 


AAY79244 


Human COT - Homo 
sapiens, 467 aa. 
[WO200011191-A2, 
02-MAR-2000] 


1..467 
1..467 


467/467(100%) 
467/467 (100%) 


0.0 


AAE10313 


Human Tpl2 protein - Homo 
sapiens, 467 aa. 
[WO200166559-A1, 
13-SEP-2001] 


1..467 
1..467 


466/467 (99%) 
466/467 (99%) 


0.0 


AAE10314 


Rat Tpl2 protein - Rattus sp, 
467 aa. [WO200166559-A1, 
13-SEP-2001] 


1..467 
1..467 


439/467 (94%) 
454/467 (97%) 


0.0 


AAY79243 


Rat TPL-2 - Rattus 
norvegicus, 467 aa. 
[WO200011191-A2, 
02-MAR-2000] 


1..467 
1..467 


438/467 (93%) 
453/467(96%) 


0.0 



In a BLAST search of public sequence databases, the NOV52a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 52E. 



Table 52E. Public BLASTP Results for NOV52a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV52a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Portion 


Expect 
Value 


P41279 


Mitogen-activated protein 
kinase kinase kinase 8 (EC 
2.7.1.-) (COT proto-oncogene 
serine/threonine-protein 
kinase) (C-COT) (Cancer 
Osaka thyroid oncogene) - 
Homo sapiens (Human), 467 
aa. 


1..467 
1..467 


467/467(100%) 
467/467 (100%) 


0.0 
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A48713 


serine/threonine-specific 
protein kinase cot, 58K form - 
human, 467 aa. 



1..467 
1..467 


466/467 (99%) 
466/467 (99%) 


0.0 


Q63562 


Mitogen-activated protein 
kinase kinase kinase 8 (EC 
2.7.1.-) (Tumor progression 
locus 2) (TPL-2) - Rattus 
norvegicus (Rat), 467 aa. 


1..467 
1..467 


438/467 (93%) 
453/467(96%) 


0.0 


Q07174 


Mitogen-activated protein 
kinase kinase kinase 8 (EC 
2.7.1.-) (COT proto-oncogene 
serine/threonine-protein 
kinase) (C-COT) (Cancer 
Osaka thyroid oncogene) - 
Mus musculus (Mouse), 467 
aa. 


1..467 
1..467 


435/467 (93%) 
454/467 (97%) 


0.0 


A41253 


kinase-related transforming 
protein (EC 2.7.1.-) - human, 
415 aa. 


1..397 
1..397 


379/397(95%) 
379/397 (95%) 


0.0 



PFam analysis predicts that the NOV52a protein contains the domains shown in the 
Table 52F. 
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Table 52F. Domain Analysis of NOV52a 


Pfam Domain 


NOV52a Match Region 


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


pkinase 


146..388 


74/279(27%) 
187/279 (67%) 


4.7e-54 



Example 53. 

The NOV53 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 53A. 



Table 53A. NOV53 Sequence Analysis 




SEQ ID NO: 233 


1078 bp 


NOV53a, 
CG94521-01 
DNA Sequence 


GCGGCTACATTCGGCCCGGCC\TG(K!!A(^ 
GGGGTTCAGCTGTTGCAAAAATAATTGOTAATAACGT^^ 

CAAGATGTGGGTCTTTGAAGAAACAGTGAATGQCAQAAAACTGACAGACATCATAAATAATGACCAT 
GAAAATGTAAAATATCTTCCTGGACACAAGCTC 




TGAGATCACTGGGAGAGTGCCOUU3AAAGCGCTGGGAATCACCCTCATC 
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CCCG^ 

TGGGAGCCAACATTGCCAATGAGGTGGCTGCAGAGAAGTTCTGTGAQACCACCATCGGCAGCAAAGT 
AATGGAGAACGGCCTTCTCTTCAAAGAACTTCTGCAGACTCCAAATTTTCGAATTACGGTGGTTGAT 
GATGCAGACACTGTTGAACTCTGTGGTGCGCTTAAGAACATCGTAGCTGTGGGAGCTGGGTTCTGCG 
ACGGC C TC CGCTGTGGAGAC AAC ACCAAAGCGGCCGTCATCCGCC TGGG ACTCATGG AAATGATTG C 
TTTTGCCAGGATCTTCTGCAAAGGCCAAGTGTCTACAGCCACCTTCCTAGAGAGCTGCGGGGTGGCC 
GACCTGATCACCACCTGTTACGGAGGGCGGAACCGCAGGGTGGCCGAGGCCTTCGCCAGAACTGGGA 
AGACCATTGAAGAGTTGGAGAAGGAGATGCTGAATGGGCAAAAGCTCCAAGGACCGCAGACTTCTGC 
TGAAGTGTACCGCATCCTCAAACAGAAGGGACTACTGGACAAGTTTCCATTGTTTACTGCAGTGTAT 



CATAAA 

ORF Start: ATG at 22 | ORF Stop: TAA at 1075 







SEQ ID NO: 234 | 351 aa | MW at 38418.3kD 


NOV53a, 
CG94521-01 
Protein Sequence 


MAAAPLKVCIVGSGNWGSAVAKI IGNNVKKLQKFASrVKMOTTFE^ INNDHENVKYLP 
GHKLPENWAMSl^SEAVQDADLLVFVIPHQFIHRICDEITGR\^KKALGITLIKGIDEGPEGLKIiI 
SDIIREKMGIDISVLMGANIANEVAAEKFCETTIGSKVM^ 

C GALKNI VAVGAGFCDGLRCGDNTKAAV IRLGLMEMI AFAR I FCKGQVSTATFLESCGVADL ITTC Y 
GGRNRRVAEAFARTGKT I EELEKEMLNGQKLQG PQTS AEVYRI LKQKGLLDKFPLFTAVYQ I CYESR 
PVQEMLSCLQSHPEHT 






SEQ ID NO: 235 | 936 bp 


NOV53b, 
CG94521-03 
DNA Sequence 


TACATTCGGCCCGGCCATiMCAGCGGCGCCCCTGAAA^ 

TCAGCTGTTGCAAAAATAATTGGTAATAATGTCAAGAAACTTCAGAAATTTGCCTCCACAGTCAAGA 
TGTGGGTCTTTCAAGAAACAGTGAATGGCAGAAAACTX3ACAGACATCATAAATAATGACCATGAAAA 
TGTAAAATATCTTCCTGGACACAAGCTGCCAGAAAATGTGGGCATAGACGAGGGCCCCGAGGGGCTG 
AAGCTCATTTCTGACATCATCCGTGAGAAGATGGGTATTGACATCAGTGTGCTGATGGGAGCCAACA 
TTGC CAATGAGGTGGC TG C AG AG AAGT TC TGTG AGACCACCATC GGC AGCAAAGTAATGGAGAACGG 
CC TTCTC TTCAAAGAAC TTCTGC AGACTCC AAATTTTCGAATTACCGTGGTTGATGATGCAGACACT 
GTTGAACTCTGTGGTGCGCTTAAGAACATCGTAGCTGTGGGAGCTGGGTTCTGCGACGGCCTCCGCT 
GTGGAGACAACACCAAAGCGGCCGTCATCCGCCTGGGACTCATGGAAATGATTGCTTTTGCCAGGAT 
CTTCTGCAAAGGCCAAGTGTCTACAGCCACCTTCCTAGAGAGCTGCGGGGTGGCCGACCTGATCACC 
ACCTGTTACGGAGGGCGGAACCGCAGGGTGGCCGAGGCCTTCGCCAGAACTGGGAAGACCATTGAAG 
AGTTGGAGAAGGAGATGCTGAATGGGCAAAAGCTCCAAGGACCGCAGACTTCTGCTGAAGTGTACCG 
CATCCTCAAACAGAAGGGACTACTGGACAAGTTTCCATTGTTTACTGCAGTGTATCAGATCTGCTAC 
GAAAGCAGACCAGTTCAAGAGATGTTGTCTTGTCTTCAGAGCCATCCAGAGCATACATAAAAAGG 




ORF Start: ATG at 17 | ORF Stop: TAA at 929 





SEQ ID NO: 236 | 304 aa | MW at 33235.2kD 


NOV53b, 
CG94521-03 
Protein Sequence 


MAAAPLKVCIVGSGNWGSAVAKI IGNNVKKLQKFASTVKMWVFEETVNGRKLTDI INNDHENVKYLP 
GHKLPENVGIDEGPEGLKLI SDIIREKMGIDI SVLMGANIANEVAAEIC?CETTIGSKVMENGLLFKE 
LLQTPNFRITVVDDADWELCGALKNIVAVGAGFC1X3L^ 

VSTATFLESCGVADL I TTC YGGRNRRVAEAFARTGKTI EELEKEMLNGQKLQGPQTS AEVYR ILKQK 
GLLDKFPLFTAVYQI CYESR PVQEMLSCLQSHPEHT 



SEQ ID NO: 237 | 1077 bp 
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NOV53c, 
CG94521-02 
DNA Sequence 



TACATTCGGCCCGGCCA TGGCAGCGGCGCCCCTG, 



TCAGCTGTTGCAAAAATAATTGGTAATAATGTCAAGAAACTTCAGAAATTTGCCTCCACAGTCAAGA 
TGTGGGTCTTTGAAGAAACAGTGAATGGCAGAAAACTGACAGACATCATAAATAATGACCATGAAAA 
TGTAAAATATCTTCCTGGACACAAGCTGCCAGAAAATGTGGTTGCCATGTCAAATCTTAGCGAGGCT 



TCACTGGGAGAGTGCCCAAGAAAGCGCTCGGAATCACCCTCATCAAGGGCATAGACGAGGGCCCCGA 
GGGGCTGAAGCTCATTTCTGACATCATCCGTGAGAAGATGGGTATTGACATCAGTGTG^ 
GC C AACATTGCCAATGAGGTGGCTGCAGAGAAGTTCTGTGAGACC ACCATCGG CAGCAAAGT AATGG 
AGAACGGCCTTCTCTTCAAAGAACTTCTGCAGACTCCAAATTTTCGAATTACCGTGGTTGA 
AGACACTGTTGAACTCTGTGGTGCGCTTAAGAACATCGTAGCTGTGGGAGCTGGGTTCTGCGACGGC 
CTCCGCTGTGGAGACAACACCAAAGCGGCCGTCATCCGCCTGGGACTCATGGAAATGA 
CCAGGATCTTCTGCAAAGGCCAAGTGTCTACAGCCACCTTCCTAGAGAGCTGCGGGGTGGCCGACCT 
GATCACCACCTGTTACGGAGGGCGGAACCX3CAGGGTGGCCGAGGCCTTCGCCAGAACTGGGAAGACC 
ATTGAAGAGTTGGAGAAGGAGATGCTGAATGGGCAAAAGCTCCAAGGACCGCAGACTTCTGCTGAAG 
TGTACCGCATCCTCAAACAGAAGGGACTACTGGACAAGTTTCCATTGTTTACTGCAGTGTATCA6AT 
CTGCTACGAAAGCAGACCAGTTCAAGAGATGTTGTCTTGTCTTCAGAGCCATCCAGAGCATACATAA 
AAA£g 



ORF Start: ATG at 17 



ORF Stop: TAA at 1070 





SEQ ID NO: 238 |351 aa |MW at 38418.3kD 


NOV53c, 
CG94521-02 
Protein Sequence 


MAAAPLKVC I VGSGNWGSAVAKI IGNNVKKLQKFASTVKMWVFEETVNGRKLTDI INNDHENVKYLP 
GHKL PENWAMSNL SEAVQDADLLVFV I PHQF I HRICDE I TGRVPKKALGI TL IKGIDEGPEGLKL I 
SDI IREKMGIDISVLMGANIANEVAAEKFCETTI 

CGALKNI VAVGAGTCDGLRCGDNTKAAVIRLGLMEMIAFAR I PCKGQVSTATFLESCGVADL ITTC Y 
GGRNRRVAEATARTGKTI EELEKEMLNGQKI/QG PQTSAEVYR ILKQKGLLDKFPLFTAVYQ I CYESR 
PVQEMLSCLQSHPEHT 



Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 53B. 



Table 53B. Comparison of NOV53a against NOV53b and NOV53c. 


Protein Sequence 


NOV53a Residues/ 
Match Residues 


Identities/ 

Similarities for the Matched Region 


NOV53b 


1..351 
1..304 


304/351 (86%) 
304/351 (86%) 


NOV53c 


1..351 
1..351 


351/351 (100%) 
351/351(100%) 



Further analysis of the NOV53a protein yielded the following properties shown in 
Table 53C. 



Table 53C. Protein Sequence Properties NOV53a 
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PSort analysis: 



0.6500 probability located in cytoplasm; 0.1000 probability located in 
mitochondrial matrix space; 0.1000 probability located in lysosome (lumen); 
0.0000 probability located in endoplasmic reticulum (membrane) 


SignalP analysis: 


Cleavage site between residues 22 and 23 
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A search of the NOV53a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publication, yielded 
several homologous proteins shown in Table 53D. 



Table 53D. Geneseq Results for NOV53a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV53a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


ABB64184 


Drosophila melanogaster 
polypeptide SEQ ID NO 
19344 - Drosophila 
melanogaster, 360 aa. 
[WO200171042-A2, 
27-SEP-2001] 


3..350 
2..349 


212/349 (60%) 
263/349 (74%) 


e-120 


AAG08446 


Arabidopsis thaliana protein 
fragment SEQ ID NO: 5988 - 
Arabidopsis thaliana, 366 aa. 
[EP1033405-A2, 
06-SEP-2000] 


7..331 
22..349 


180/329 (54%) 
233/329 (70%) 


8e-95 


AAG08445 


Arabidopsis thaliana protein 
fragment SEQ ID NO: 5987 - 
Arabidopsis thaliana, 400 aa. 
[EP1033405-A2, 
06-SEP-2000] 


7..331 
56..383 


180/329(54%) 
233/329 (70%) 


8e-95 


AAG08444 


Arabidopsis thaliana protein 
fragment SEQ ID NO: 5986 - 
Arabidopsis thaliana, 421 aa. 
[EP1033405-A2, 
06-SEP-2000] 


7..331 
77..404 


180/329(54%) 
233/329(70%) 


8e-95 


AAG39422 


Arabidopsis thaliana protein 
fragment SEQ ID NO: 48774 
- Arabidopsis thaliana, 366 
aa. [EP1033405-A2, 
06-SEP-2000] 


7..331 
22..349 


180/329(54%) 
232/329 (69%) 


1e-94 
In a BLAST search of public sequence databases, the NOV53a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 53E. 



Table 53E. Public BLASTP Results for NOV53a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV53a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Portion 


Expect 
Value 


AAH28726 


KIAA0089 protein - Homo 
sapiens (Human), 351 aa. 


1..351 
1..351 


351/351 (100%) 
351/351(100%) 


0.0 


Q14702 


KIAA0089 protein - Homo 
sapiens (Human), 411 aa 
(fragment). 


1..351 
61..411 


351/351 (100%) 
351/351 (100%) 


0.0 


O57656 


Glycerol-3-phosphate 
dehydrogenase [NAD+], 
cytoplasmic (EC 1.1.1.8) 
(GPD-C) (GPDH-C) - Fugu 
rubripes (Japanese 
pufferfish) (Takifugu 
rubripes), 351 aa. 


3..350 
2..350 


265/349 (75%) 
306/349 (86%) 


e-155 


Q98SJ9 


Glycerol-3-phosphate 
dehydrogenase (EC 1.1.1.8) - 
Salmo salar (Atlantic 
salmon), 350 aa. 


7..350 
5..349 


258/345 (74%) 
301/345 (86%) 


e-152 


AAH32234 


Glycerol-3-phosphate 
dehydrogenase 1 (soluble) - 
Homo sapiens (Human), 349 
aa. 


4..350 
2..348 


249/347(71%) 
297/347 (84%) 


e-149 






PFam analysis predicts that the NOV53a protein contains the domains shown in the 
Table 53F. 




Table 53F. Domain Analysis of NOV53a 


Pfam Domain 


NOV53a Match 
Region 


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


NAD_Gly3P_dh 


5..344 


167/365 (46%) 
307/365 (84%) 


2.1e-184 
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Example 54. 

The NOV54 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 54A. 
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Table 54A. NOV54 Sequence Analysis 




SEQ ID NO: 239 


1552 bp 


NOV54a, 
CG96613-01 
DNA Sequence 


TTATTCCCCACTTTACCTGGCTAATTGAAGTGTAACAAAAGCTTCATCCAGGAACATTGGCGCGGGA 

AACCTGGCGTACTGGCTGTGGCTTCTCTAGCGGGACTCGGCATGAGGCTGGCGCGGCTGCTTCGCGG 


AGCCGCCTTGGCCGGCCCGGGCCCGGGGCTGCGCGCCGCCGGCTTCAGCCGCAGCTTCAGCTCGGAC 

tcgggctccagcccggcgtccgagcgcggcgttcccggccaggtggacttctacgcgcgcttctcgc 

cgtccccgctctccatgaagcagttcctggacttcggatcagtgaatgcttgtgaaaagacctcatt 

tatgtttctgcggcaagagttgcctgtcagactggcaaatataatgaaagaaataagtctccttc 

gataatcttctcaggacacc^tccgttcaattggtacaaagc^gtatatccagagtcttcaggagc 

ttcttgattttaaggacaaaagtgctgaggatgctaaagctatttatgactttacagatactgtgat 

acggatcagaaaccgacacaatgatgtcattcccacaatggcccagggtgtgattgaatacaaggag 

agctttggggtggatcctgtcaccagccagaatgttcagtactttttggatcgattctacatgagtc 

gcatttcaattagaatgttactcaatcagcactctttattgtttggtggaaaaggcaaaggaagtcc 

atctcatcgaaaacacattggaagcataaatccaaactgcaatgtacttgaagttattaaagatggc 

tatgaaaatkk:taggcgtctgtgtgatttgtattatattaactctcccgaactagaacttgaagaac 

taaatgcaaaatcaccaggacagcgaatacaagtggtttatgtaccatcccatctctatcacatggt 

gtttgaacttttcaagaatgcaatgagagccactatggaacaccatgcauvcagaggtgtttacccc 

cctattcaagttcatgtcacgctgggtaatgaggatttgactgtgaagatgagtgaccgaggaggtg 

gcgttcctttgaggaaaattgacagacttttcaactacatgtattcaactgca 

tgaglagc tcccgcgcagtgcctctggctggttt^ tttacgca 

caatacttccaaggagacctgaagctgtattccctagagggttacgggacagatgcagttatctaca 

ttaaggctctgtcaacagactcaatagaaagactcccagtgtataacaaagctgcctggaagcatta 

caacaccaaccacgaggctgatgactggtgcgtccccagcagagaacccaaagacatgacgacgttc 

rc^agtgcctaoacacactggggacatcggaaaatccaaatgtggcttttg 




TATGGTGTTCAGAACTATATTATACCAAGTACTTTATTTATCGTTTTCACAAAACTATTTGAGTAGA 




ATAAATG G AAA 




ORF Start: ATG at 109 


ORF Stop: TAG at 1417 





SEQ ID NO: 240 |436 aa |MW at 49243.6kD 


NOV54a, 
CG96613-01 
Protein Sequence 


MRIJOILLRGAALAGPGPGLRAAGFSRSFSSDSGSSPASERGVPGQVDFYARFSPSPLSMKQFLDFGS 
VNACEKTSFOTLRQEXPVRIJ^IMKEISLLPDNLLRTPSVQLVQSWYIQSLQELLDFKDKSAEDAKA 
I YDFTDTV IR I RNRHNDVI PTMAQGVI EYKESFGVDPVTSQNVQYFIjDRF YMSRI SIRMLLNQHSLL 
FGGKGKGSPSHRKHIGSINPNOWLEVIKrXSYENARRL^ 
VPSHLYHMTFEIjFKNAMRATMEHHANRGVYPPIQVHVTLGNED 

YS T APRPRVET SRAVPLAGFGYGL P I SRL YAQYFQGDLKLY SLEGYGTDAVI YIKAL STDS I ERLPV 
YNKAAWKHYNTNHEADDWCVP SREPKDMTTFRS A 






SEQ ID NO: 241 | 1612 bp 


NOV54b, 
CG96613-03 
DNA Sequence 


TTATTCCCCACTTTACCTGGCTAATTGAAGTGTAACAAAAGCTTCATCCAGGAACATTGGCGCGGGA 


AACCTGGCGTACTGGCTGTGGCTTCTCFAGCGGGACTC 


AGCCGCCTTGGCCGGCCCGGGCCCGGGGCTGCGCGCCGCCGGCTTCAGCCGCAGCTTCAGCTCGGAC 

TCGGGCTCCAGCCCGGCGTCCGAGCGCGGCGTTCCGGGCCAGGTGGACTTCTACGCGCGCTTCTCGC 

CGTCCCCGCTCTCCATGAAGCAGTTCCTGGACTTCGGATCAGTGAATGCTTGTGAAAAGACC 

TATGTTTCTGCGGCAAGAGTTGCCTGTCAGACTGGCAAATATAATGAAAGAAATAAGTC 

GATAATCTTCTCAGGACACCATCCGTTCAATTGGTACAAAGCTGGTATATCCAGAGTCTTCAGGAGC 

TTC T TG ATTTT AAGGACAAAAGTGC TG AG G ATCCT AAAGCT ATTT ATGAAAGG CC T AG AAG AACATG 

GTTGCAGGTCTCTAGTTTATGCTGTATGGCCTGCAAGATGATCTTTACAGATACTGTGA 

AGAAACCGACACAATGATGTCATTCCCACAATGGCCCAGGGTGTGATTGAATACAAGGAGAGCTT 

GX3GTGGATCCTGTCACCAGCCAGAATGTTCAGTACTTTTTGGATCGATTCTACATGAGTCGCATTTC 
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cgaa^cacattggaagcataaatccaaactgcaatgtacttgaagttattaaagatggctatgaaa 
atgctaggcgtctgtgtgatttgtattatattaactctcccgaactagaacttgaagaactaaatgc 
aaaatcaccaggacagccaatacaagtggtttatgtaccatcccatctctatcacatggtgtttgaa 

CTTTTCAAGAATGCAATGAGAGCCACTATGGAACACCATGCCAACAGAGGTGTTTACCCCCCTATTC 
AAGTTC ATGTCACGC TGGGTAATG AGGATTTGACTGTGAAGATGAGTG ACCG AG GAGGTGGCGTTCC 
TTTGAGGAAAATTGACAGACTTTTC^CTACATGTATTCAACTGCACCAAGACCTCGTGTTGAGACC 
TCCCGCGCAGTGCCTCTGGCTGGTTTTGGTTATGGATTGCCCATATCACGTCTTTACGCACAATACT 
TC CAAGGAGACCTGAAGC TGTATTCCCTAGAGGGTTACGGGACAGATGC AGTTATCT AC ATTAAGGC 
TC TGTCAACAGACTC AATAGAAAGACTCCC AGTGT ATAAC AAAGC TGCC TGGAAGCATTAC AAC ACC 
AACCACGAGGCTGATGACTGGTGCGTCCCCAGCAGAGAACCCAAAGACATGACGACGTTCCGCAGTG 
CCTA GACACACTGGGQACATCGGAAAATCCAAATGTGGCTTTTGTATTAAATTTGGAAGGTATGGTQ 
TTCAGAACTATATTATACCAAGTACTTTATTTATCGTTTTCACAAAACTATTTGAGTAGAATAAATG 
GAAA 

ORF Start: ATG at 109 | ORF Stop: TAG at 1477 





SEQ ID NO: 242 | 456 aa | MW at 51622.6kD 


NOV54b, 
CG96613-03 
Protein Sequence 


MRL ARLIiRGAALAGPGPGLRAAGFSRSFSSDSGS S PASERGVPGQVDFYARFS P S PL SMKQFLDFGS 
VNACEKTSFMFIiRQELPVRLANIMKEI SLL PDNLLRTPSVQLVQSWY I Q SLQELLDFKDKSAEDAKA 
I YERPRRTWLQVS SLCCMACKM I FTDTVIR IRNRHNDVI PTMAQGVI EYKBS FGVDPVTSQNVQYFL 
DRFYMSRISIRMLLNQHSLLFGGKGKGSPSHRKHIGSM 

ELELEELNAKSPGQPIQVVYVPSHLYHMVFELFKNAMRATMEHHANRGVYPPIQVH^ 
MSDRGGGVPLRKIDRLFNYMYSTAPRPRVETSRAVPI^ 

TDAVI YIKAL STDSIERL PVYNKAAWKHYNTNHEADDWCVPSREPKEMTTFR SA 





SEQ ID NO: 243 | 967 bp 


NOV54c, 
CG96613-02 
DNA Sequence 


TTATTC C CC AC T TT ACCTG G C TA AT TG AAG TGT AAC AAAAGC T T CATC C AGG AAC ATTGG C GCGGGA 


AACCTGGCGTACTGGCTGTGGCTTCTCTAGCGGGACTCGGCATGAGGCTGGCGCGGCTGCTTrGCTRG: 


AGCCGCCTTGGCCGGCCCGGGCCCGGGGCTGCGCGCCGCCGGCTTCAGCCGCAGCTTCAGCTCGGAC 

TCGGGCTCCAGCCCGGC<5TCCGAGCGCGGCGTTCCGGGCCAGGTGGACTTCTACGCGCGCTTCTCGC 

CGTCCCCGCTCTCC ATGAAGCAGTTC C TGG ACTTCGGATCAGTGAATGCTTGrGAAAAGACCTCATT 

TATGTTTCTGCGGCAAGAGTTGCCTGTCAGACTGGCAAATATAATGAAAGAAATAAGTCTCCTTCCA 

GATAATCTTCTCAGGACACCATCCGTTCAATTGGTACAAAGCTGGTATATCCA 

TTCTTGATTTTAAGGACAAAAGTGCTGAGGATGCTAAAGCTATTTATGAAAGGCCTAGAAGAACATG 

GTTGCAGGTCTCTAGTTTATGC TGTATGGCC TGCAAGATGATCTTTACAGAT ACTGTGATACGGATC 

AGAAACCGACACAATGATCTCATTCCCACAATGGCCCAGGGTGTGATTGAATACAAGGAGAGCTTTG 

GGGTGGATCCTGTCACCAGCCAGAATGTTCAGTACTTTATTTATCGTTTTCACAAAACTATTTGAGT 

AGAATAAATGGAAACTGAATTCCATTTGTGCC03TTAAACCTCCTAAAGGATGAAATTGCACCTATT 


TTACACCTATATTTTCACAGTTAATTGAACATATTTTTAAACAACTGTAGTT T TGGGCAACTTTTCA 


CTTTGTGGTAGACTTCAGAAGTGTGGAAATCTTCGGGTTTCTATAGGAAACTAGTTTTTTTT 


AAAAAAATCCTTTCTTTTTTGTGGGCTAG 




ORF Start: ATG at 109 | ORF Stop: TGA at 733 





SEQ ID NO: 244 |208 aa |MW at 23483.8kD 


NOV54c, 
CG96613-02 
Protein Sequence 


MRLARLLRGAALAGPGPGLRAAGFSRSFSSDSGSSPASERGVTC^ 

VNACEKTSFMFI^QELPVRLANIMKEISLLPDNLLRTPSVQLVQSWYIQSLQELLDFKDKSAEDAKA 
I YERPRRTWLQVSSLCCMACKMI FTDTVIR IRNRHNDVI P TMAQGVI EYKESFGVDPVTSQWVQYF I 
YRFHKTI 
Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 54B. 



Table 54B. Comparison of NOV54a against NOV54b and NOV54c 


Protein Sequence 


NOV54a Residues/ 
Match Residues 


Identities/ 

Similarities for the Matched Region 


NOV54b 


42..436 
42..456 


394/415 (94%) 
395/415 (94%) 


NOV54c 


42..185 
42..205 


140/164(85%) 
143/164(86%) 
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Further analysis of the NOV54a protein yielded the following properties shown in 
Table 54C. 
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Table 54C. Protein Sequence Properties NOV54a 


PSort analysis: 


0.4251 probability located in mitochondrial matrix space; 0.3802 probability 
located in microbody (peroxisome); 0.1914 probability located in lysosome 
(lumen); 0.1017 probability located in mitochondrial inner membrane 


SignalP analysis: 


Cleavage site between residues 22 and 23 



A search of the NOV54a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publication, yielded 
several homologous proteins shown in Table 54D. 



Table 54D. Geneseq Results for NOV54a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV54a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


ABG16621 


Novel human diagnostic 
protein #16612 - Homo 
sapiens, 415 aa. 
[WO200175067-A2, 
11-OCT-2001] 


42..43S 
21..413 


269/395 (68%) 
331/395 (83%) 


e-162 
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ABB58044 


Drosophila melanogaster 
polypeptide SEQ ID NO 924 
- Drosophila melanogaster, 
413 aa. [WO200171042-A2, 
27-SEP-2001] 



26..420 
2..396 



219/401 (54%) 
288/401 (71%) 



e-121 


AAE07838 


Maize pyruvate 
dehydrogenase kinase 
(PDK)-2 - Zea mays, 364 aa. 
[US6265636-B1, 
24-JUL-2001] 


40..401 
8..364 


144/374(38%) 
211/374(55%) 


2e-60 


AAW64724 


A. thaliana PDHK protein 
from clone YA5 - 
Arabidopsis thaliana, 366 aa. 
[WO9835044-A1, 
13-AUG-1998] 


57..401 
29..366 


142/357 (39%) 
209/357 (57%) 


3e-58 


AAE07837 


Maize pyruvate 
dehydrogenase kinase 
(PDK)-1 - Zea mays, 347 aa. 
[US6265636-B1, 
24-JUL-2001] 


40..401 
8..347 


135/371 (36%) 
205/371 (54%) 


4e-56 


In a BLAST search of public sequence databases, the NOV54a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 54E. 


Table 54E. Public BLASTP Results for NOV54a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV54a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Portion 


Expect 
Value 


Q15118 


[Pyruvate dehydrogenase 
[lipoamide]] kinase isozyme 
1, mitochondrial precursor 
(EC 2.7.1.99) (Pyruvate 
dehydrogenase kinase isoform 
1) - Homo sapiens (Human), 
436 aa. 


1..436 
1..436 


436/436(100%) 
436/436(100%) 


0.0 


Q63065 


[Pyruvate dehydrogenase 
[lipoamide]] kinase isozyme 
1, mitochondrial precursor 
(EC 2.7.1.99) (Pyruvate 
dehydrogenase kinase isoform 
1) (PDK P48) - Rattus 
norvegicus (Rat), 434 aa. 


1..436 
1..434 


402/436(92%) 
412/436(94%) 


0.0 
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Q8R2U8 


Similar to pyruvate 
dehydrogenase kinase, 
isoenzyme 1 - Mus musculus 
(Mouse), 432 aa. 



1..436 
1..432 


401/436 (91%) 

412/436 (93%) 


0.0 


Q15119 


[Pyruvate dehydrogenase 
[lipoamide]] kinase isozyme 
2, mitochondrial precursor 
(EC 2.7.1.99) (Pyruvate 
dehydrogenase kinase isoform 
2) - Homo sapiens (Human), 
407 aa. 


37..434 
11..405 


277/398 (69%) 
340/398 (84%) 


e-168 


I70159 


[pyruvate dehydrogenase 
(lipoamide)] kinase (EC 
2.7.1.99) 2 - human, 407 aa. 


37..434 
11..405 


276/398 (69%) 
340/398(85%) 


e-168 
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PFam analysis predicts that the NOV54a protein contains the domains shown in the 
Table 54F. 



Table 54F. Domain Analysis of NOV54a 


Pfam Domain 


NOV54a Match Region 


Identities/ 
Similarities 
for the Matched 
Region 


Expect Value 


HATPase_c 


268..393 


32/134 (24%) 
84/134 (63%) 


8.5e-20 



Example 55. 

The NOV55 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 55A. 



Table 55A. NOV55 Sequence Analysis 


NOV55a, 
CG96736-01 
DNA Sequence 


SEQ ID NO: 245 | 2885 bp 

TACACAGGGGCCCGCATCCC ACCCTCCCGGACC TAAOAGCCTGGGTCCCCTGTTTCCGGARTPrfiP T 
p££^CCCCCAGATTCTGGCATCCCA^^ 
2S32iSa3^CAG^^ 




GGOCACTCAACCTCCTGGAGCCAAGGGCCCPACGTCCCACCCAGAGAAACTCTCGTATTCCCAGr^ 
GACSAAGCCTCAGCTCC^GCX:^^ 

GCTGTGAACTCACAACTCTAAGGAGCCCTTCAAAGTTCCAGTCTC 

TCCTAGGAACGTCGGGTCCTCGGAAGGAGCr^ 

CCCGGTGCTTCCCATC^ 

GTTCCCOTGAC^TG^ 
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GGTGGCCGGO^ 

GCGCTTGAGGCCTTCGTCTTCCCGGGCGAGCTGCTGCTGCGTCTGCTGCGGATGATCATCTTGCCGC 
TGGTGGTGTGCAGCTTGATCGGCGGCGCCGCCAGCCTGGACCCCGGCGCGCTCGGCCGTCTGGGCGC 
C TGGGCGCTGC TC TTTTTC CTGGTCACCACGC TGCTGGCGTCGGCGCTCGG AGTGGGC TTGGCGCTG 
GCTCTGCAGCCGGGCGCCGCCTCCGCCGCCATCAACGCCTCCGTGGGAGCCGCGGGCAGTGCCGAAA 
ATGCC CCC AGCAAGGAGGTGC TCGATTCGTTC CTGGATCTTGCGAGAAATATCTTCCCTTCC AACCT 
GGTGTCAGCAGCCTTTCGCTCATACTCTACCACCTATGAAGAGAGGAATATCACCGGAACCAGGGTG 
AAGGTGCCCGTGGGGCAGGAGGTGGAGGGGATGAACATCCTGGGCTTGGTAGTGTTTGCCATCGTCT 
TTGGTGTGGCGCTGCGGAAGCTGGGGCCTGAAGGGGAGCTGCTTATCCGCTTCTTCAACTCCTTCAA 
TGAGGCCACCATGGTTCTGGTCTCCTGGATCATGTGGTACGCCCCTGTGGGCATCATGTTCCTGGTG 
GCTGGCAAGATCGTGGAGATGGAGGATGTGGGTTTACTCTTTGCCCGCCTTGGCAAGTACATTCTGT 
GCTGCCTGCTGGGTCACGCCATCCATGGGCTCCTCGTACTGCCCCTCATCTACTTCCTCTTCACCCG 
CAAAAACCCCTACCGCTTCCTGTGGGGCATCGTGACGCCGCTGGCCACTGCCTTTGGGACCTCTTCC 
AGTTCCGCCACGCTGCCGCTGATGATGAAGTGCGTGGAGGAGAATAATGGCGTGGCCAAGCACATCA 
GCCGTTTCATCClWCCATCGGCGCCACCGTCAACATGGACGGTGCaSCGCTCTTCCAGTGCGTGGC 
CGCAGTGTTCATTGCACAGCTCAGCCAGCAGTCCTTGGACTTCGTAAAGATCATCACCATCCTGGTC 
ACGGCCACAGCGTCCAGCGTGGGGGCAGCGGGCATCCCTGCTGGAGGTGTCCTCACTCTGGCCATCA 
TCCTCG AAGCAGTCAACCTCC CGGTCGACCATATC TCC TTGATC C TGGCTGTGGACTGGCTAGTCGA 
CCGGTCCTGTACCGTCCTCAATGTAGAAGGTGACGCTCTGGGGGCAGGACTCCTCCAAAATTATGTO 
GACCGTACGGAGTCGAGAAGCACAGAGCCTGAGTTGATACAAGTGAAGAGTGAGCTGCCCCTGGATC 
CGCTGCCAGTCCCCACTGAGGAAGGAAACCCCCTCCTCAAACACTATCGGGGGCCCGCAGGGGATGC 
CACGGTCGCCTCTGAGAAGGAATCAGTCATGTA AACCCCGGGAGGGACCTTCCCTGCCCTGCTGGGG 



GTGCTCTTTGGACACTGGATTATGAGGAATGGATAAATGGATGAGCTAGGGCTCTGGGGGTCTGCCT 



GCACACTCTGGGGAGCCAGGGGCCCCAGCACCCTCCAGGACAGGAGATCTGGGATGCCTGGCTGCTG 



GAGTACATGTGTTCACAAGGGTTACTCCTCAAAACCCCCAGTTCTCACTCATGTCCCCAACTCAAGG 



CTAGAAAACAGCAAGATGGAGAAATAATGTTCTGCTGCGTCCCCACD3TGACCTGCCTGGCCTCCCC 



TGTCTCAGGGAGCAGGTCACAGGTCACCATGGGGAATTCTAGCCCCCACTGGGGGGATGTTACAACA 



CCATGCTGGTTATTTTGGCGGCTGTAGTTGTGGGGGGATGTGTGTGTGCACGTGTGTGTGTGTGTGT 



GTGTGTGTGTGTGTGTGTGTTCTGTGACCTCCTGTCCCCATGGTACGTCCCACCCTGTCCCCAGATC 



CCCTATTCCCTCCACAATAACAGAAACACTCCCAGGGACTCTGGGGAGAGGCTGAGGACAAATACCT 



GCTGTCACTCCAGAGGACATTTTTTTTAGCAATAAAATTGAGTGTCAACTATTAAAAAAAAAAAAAA 



AAAA 



ORF Start: ATG at 620 | ORF Stop: TAA at 2243





SEQ ID NO: 246 |541 aa 


MW at 56620.6kD 


NOV55a,
CG96736-01 
Protein Sequence 


MVADPPRDSKGLAAAEPPPTGAWQLASI EDO^AAAGGYCGSRDLVRRCLRANLLVLLTVVAVVAGVA 
LiGIiGVSGAGGAIiALG PGALEAFVFPGELLLRLLRMI I L PLWC SL IGG AAS LDPGALGRLG AWALLF 
FLVTTLlASAIXJVGLArJUJQPGAASAAINASVGAAGSAENAPSKEVLDSFLDLARNIFPSNLVSA^ 
RSYSTTYEFJ^ITGTRVKVPVGQEVEGMNILGL^ 

LVSW IMWYAPVG IMF L V AGR I VEMED VGL L FARLGK Y I LC CLLGHAI HGLL VL PL IYFLFTRKNPYR 
FLWGIVTPI.ATAFGTSSSSATLPLMMKCVEENNGVAKHISRFILPIGATVNMDGAALFQCVAAVFIA 
QLSQQSLDFVKI ITILVTATASSVGAAGIPAGGVLTLAI ILEAVNLPVDHISLILAVDWLVDRSCTV 
LNVEGDALGAGLLQim^RTESRSTEPELIQVKSELPLDPLPVPTEEGNPLLKHYRGPAGDAWASE 
KESVM 





SEQ ID NO: 247 | 2017 bp


NOV55b, 
CG96736-02 
DNA Sequence 


CGTACAACTCCGCCCCATTGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAG 


AGCTCTCTGGC TAACTAGAGAACCCACTGC TTACTGGCTTATCGAAATTAATACGACTCACTATAGG 


GAG AC C C AAGC TG G C T AG CGTTT AAAC TT AAG CTTG G T AC CG AGC TC GGATCC AC TAGTC CAGTGTG 

GTGGAATTCCACCATGGTGGCCGATCCTCCTCGAGACTCCAAGGGGCTCGCAGCGGCGGAGCCCACC 

GCCAACGGGGGCCTGGCGCTGGCCTCCATCGAGGACCAAGGCGCGGCAGCAGGCGGCTACTGCGGTT 

CCCGGGACCAGGTGCGCEGCTGCCTTCGAGCCAACCTGCTTGT^^ 

G^CGG<:GTG<3CGCTGGGACTGGGGGTGTCGGGGGCCGGGGGTGC 

TTGAGCGCCTTCGTCTTCCCGG<3CGAGCTGCTGCTGCGTCTGCTGCGGATGATCATCTTGCCGCTG^ 
TGGTGTGCAGCTTGATCGGCGGCGCCGCCAGCCTGGACCCCGGCGCGCTCGGCCGTCTGGGCGCCTG 




CTGCAGCCGGGCGCCGCCTCCGCCGCC^TCAACGCCTCCGTGGGAGCCGCGGGCAGTGCCGAAAATG 
CCCCC AGCAAGGAGGTGCTCGATTCGTTCC TGGATCTTGCGAGAAATATCTTCCCTTCC AACC TGGT 
GTCAGCAGCCOTTCGCTCATACTCTACCACCTATGAAGAGAGGAATATCACCGGAACCAGGGTGAAG 
GTGCCCGTGaSGCAGGAGGTGGAGGGGATGAACATCCTGGGCTTGGTAGTGTTTGCCATCGTCTTTO 
GTGTGGCGCTGCGGAAGCTGGGGCCTGAAGGGGAGCTGC TTATCCGC TTCTTCAACTCCTTC AATGA
^^^^^^^



GGCCACCATGGTTCTGGTCTCCTGGATCATGTGGT. 
GGCAAGATCGTGGAGATGGAGGATGTGGGTTTACTCTTTGCCCGCCTTGGCAAGTACATTCTGTGCT 
GCCTGCTGGGTCACGCCATCCATGGGCTCCTGGTACTGCCCCTCATCTACTTCCTCTTCACCCGCAA 
AAACCCCTACCGCTTCCTGTGGGGCATCGTGACGCCGCTGGCCACTGCCTTTGGGACCTCTTCCAGT 
TCCGCCACGCTGCCGCTGATGATGAAGTGCGTGGAGGAGAATAATGGCGTGGCCAAGCACATCAGCC 
GTTTCATCCTGCCCATCGGCGCCACCGTCAACATGGACGGTGCCGCGCTCTTCCAGTGCGTGGCCGC 
AGTGTTCATTGCACAGCTCAGCCAGCAGTCCTTGGACTTCGTAAAGATCATCACCATCCTGGTCACG 
GCCACAGCGTCCAGCGTGGGGGCAGCGGGCATCCCTGCTGGAGGTGTCCTCACTCTGGCCATCATCC 
TCGAAGCAGTCAACCTCCCGGTCGACCATATCTCCTTGATCCTGGCTGTGGACTGGCTAGTCGACCG 
GTCCTGTACCGTCCTCAATGTAGAAGGTGACGCTCTGGGGGCAGGACTCCTCCAAAATTACGTGGAC 
CGTACGGAGTCGAGAAGCACAGAGCCTGAGTTGATACAAGTGAAGAGTGAGCTGCCCCTGGATCCGC 
TGCCAGTCCCCACTGAGGAAGGAAACCCCCTCCTCAAACACTATCGGGGGCCCGCAGGGGATGCCAC 
GGTCGCCTCTGAG-AAGGAATCAGTCATGTA AGCGGCCGCTCGAGTCTAGAGGGCCCGTTTAAACCCG 



CTGATCAGCCTCGACTGTGCCTTCTAGTTGCCAGCCATCTGTTGTTTGCCCCTCCCCCGTGCCTTC 



TTOACCCTGGAAGGTGCCACTCCCACTGTCCTTTCCTAATAAAATGAGGAAATTGCATCGCATTGTC 



TGAGTAG 



ORF Start: at 134 | ORF Stop: TAA at 1838





SEQ ID NO: 248 |568aa 


MW at 59557.8kD 


NOV55b, 
CG96736-02 
Protein Sequence 


GDPSWLAFKIjKI/STELGSTS PVWVWSTMVADPPIUDSKGLAAAEPTANGGLALAS 
SRDQVRRCLRANLLVLLTWAWAGVALGLGVSGAGGALALGPERLSAFVF PGELLLRLLRMI I LPL 
WC SL IGGAASLDPGALGRLGAWALLPPLVTTLLASAIiGVGLAIjALQ PGAASAA imas vg aag saen 
APSKEVLDSFLDLARNIFPSNLVSAAFRSYSTTYEERNITGTR^ 

gvalrklgpegellirffnsfiieatmvlvswi^^ 
cllghaihgllvlpliyflftrknpyrflwgi 

rf i lp igatvnmdgaalfqc vaavf iaqlsqqs ldfvk i itilvtatassvgaagipaggvltlai i 

leavl^pvdhislilaw^vdrsctvuwegd^^ 

lpvpteegnpllkhyrgpagdatvasekesvm 
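
The ORF coordinates and protein lengths reported in Table 55A can be cross-checked with simple arithmetic. The sketch below is illustrative only: the function name is hypothetical, and it assumes that the reported positions are 1-based and that the "ORF Stop" position is the first base of the stop codon.

```python
def orf_codon_count(orf_start: int, stop_codon_start: int) -> int:
    """Return the number of amino-acid codons in an ORF.

    Assumption: 1-based nucleotide coordinates, with orf_start at the first
    base of the ORF and stop_codon_start at the first base of the stop codon.
    """
    coding_length = stop_codon_start - orf_start
    if coding_length <= 0 or coding_length % 3 != 0:
        raise ValueError("coordinates do not describe a whole number of codons")
    return coding_length // 3


# Coordinates as reported in Table 55A.
print(orf_codon_count(620, 2243))   # NOV55a -> 541
print(orf_codon_count(134, 1838))   # NOV55b -> 568
```

Under these assumptions the results agree with the reported lengths of SEQ ID NO: 246 (541 aa) and SEQ ID NO: 248 (568 aa).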



Sequence comparison of the above protein sequences yields the sequence
relationships shown in Table 55B.

Table 55B. Comparison of NOV55a against NOV55b.

Protein Sequence | NOV55a Residues/Match Residues | Identities/Similarities for the Matched Region
NOV55b | 1..541 / 28..568 | 423/541 (78%); 423/541 (78%)



Further analysis of the NOV55a protein yielded the properties shown in
Table 55C.



Table 55C. Protein Sequence Properties NOV55a 


PSort analysis: 


0.6000 probability located in plasma membrane; 0.4000 probability located in 
Golgi body; 0.3000 probability located in endoplasmic reticulum (membrane); 
0.3000 probability located in microbody (peroxisome)

SignalP analysis: Cleavage site between residues 70 and 71



A search of the NOV55a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 55D. 



Table 55D. Geneseq Results for NOV55a

Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV55a Residues/Match Residues | Identities/Similarities for the Matched Region | Expect Value
ABG61858 | Prostate cancer-associated protein #59 - Mammalia, 541 aa. [WO200230268-A2, 18-APR-2002] | 1..541 / 1..541 | 531/541 (98%); 531/541 (98%) | 0.0
AAR95044 | Apoptosis participating protein - Homo sapiens, 514 aa. [JP08089257-A, 09-APR-1996] | 1..513 / 1..513 | 499/513 (97%); 499/513 (97%) | 0.0
AAY78144 | Human neutral amino acid transporter ASCT1 - Homo sapiens, 532 aa. [US6020479-A, 01-FEB-2000] | 32..541 / 21..532 | 314/521 (60%); 378/521 (72%) | e-161
AAY99961 | Human amino acid transporter ASCT1 protein - Homo sapiens, 532 aa. [US6074828-A, 13-JUN-2000] | 32..541 / 21..532 | 314/521 (60%); 378/521 (72%) | e-161
AAY97139 | ASCT1 human neutral amino acid transporter protein - Homo sapiens, 532 aa. [US6100085-A, 08-AUG-2000] | 32..541 / 21..532 | 314/521 (60%); 378/521 (72%) | e-161



In a BLAST search of public sequence databases, the NOV55a protein was found to
have homology to the proteins shown in the BLASTP data in Table 55E. 



Table 55E. Public BLASTP Results for NOV55a

Protein Accession Number | Protein/Organism/Length | NOV55a Residues/Match Residues | Identities/Similarities for the Matched Portion | Expect Value
AAD09814 | Neutral amino acid transporter - Homo sapiens (Human), 541 aa. | 1..541 / 1..541 | 532/541 (98%); 532/541 (98%) | 0.0
Q15758 | Neutral amino acid transporter B(0) (ATB(0)) (Sodium-dependent neutral amino acid transporter type 2) (RD114/simian type D retrovirus receptor) (Baboon M7 virus receptor) - Homo sapiens (Human), 541 aa. | 1..541 / 1..541 | 531/541 (98%); 531/541 (98%) | 0.0
O19105 | Neutral amino acid transporter B(0) (ATB(0)) (Sodium-dependent neutral amino acid transporter type 2) - Oryctolagus cuniculus (Rabbit), 541 aa. | 1..541 / 1..541 | 459/542 (84%); 485/542 (88%) | 0.0
Q95JC7 | Neutral amino acid transporter B(0) (ATB(0)) (Sodium-dependent neutral amino acid transporter type 2) - Bos taurus (Bovine), 539 aa. | 1..541 / 1..539 | 465/542 (85%); 486/542 (88%) | 0.0
AAM94351 | Na+-dependent amino acid transporter ASCT2 - Rattus norvegicus (Rat), 551 aa. | 1..541 / 1..551 | 445/553 (80%); 471/553 (84%) | 0.0



PFam analysis predicts that the NOV55a protein contains the domains shown in
Table 55F.

Table 55F. Domain Analysis of NOV55a

Pfam Domain | NOV55a Match Region | Identities/Similarities for the Matched Region | Expect Value
SDF | 54..485 | 195/465 (42%); 373/465 (80%) | 1.5e-178

Example B: Sequencing Methodology and

1. GeneCalling™ Technology: This is a proprietary method of performing
differential gene expression profiling between two or more samples developed at CuraGen
and described by Shimkets, et al., "Gene expression analysis by transcript profiling coupled
to a gene database query" Nature Biotechnology 17:798-803 (1999). cDNA was derived
from various human samples representing multiple tissue types, normal and diseased states,
physiological states, and developmental states from different donors. Samples were
obtained as whole tissue, primary cells or tissue cultured primary cells or cell lines. Cells
and cell lines may have been treated with biological or chemical agents that regulate gene
expression, for example, growth factors, chemokines or steroids. The cDNA thus derived
was then digested with as many as 120 pairs of restriction enzymes, and pairs of
linker-adaptors specific for each pair of restriction enzymes were ligated to the appropriate
end. The restriction digestion generates a mixture of unique cDNA gene fragments.
Limited PCR amplification is performed with primers homologous to the linker-adaptor
sequence, where one primer is biotinylated and the other is fluorescently labeled. The
doubly labeled material is isolated and the fluorescently labeled single strand is resolved by
capillary gel electrophoresis. A computer algorithm compares the electropherograms from
an experimental and control group for each of the restriction digestions. This and additional
sequence-derived information is used to predict the identity of each differentially expressed
gene fragment using a variety of genetic databases. The identity of the gene fragment is
confirmed by additional, gene-specific competitive PCR or by isolation and sequencing of
the gene fragment.

2. SeqCalling™ Technology: cDNA was derived from various human
samples representing multiple tissue types, normal and diseased states, physiological states,
and developmental states from different donors. Samples were obtained as whole tissue,
primary cells or tissue cultured primary cells or cell lines. Cells and cell lines may have
been treated with biological or chemical agents that regulate gene expression, for example,
growth factors, chemokines or steroids. The cDNA thus derived was then sequenced using
CuraGen's proprietary SeqCalling technology. Sequence traces were evaluated manually
and edited for corrections if appropriate. cDNA sequences from all samples were
assembled together, sometimes including public human sequences, using bioinformatic
programs to produce a consensus sequence for each assembly. Each assembly is included in
CuraGen Corporation's database. Sequences were included as components for assembly
when the extent of identity with another component was at least 95% over 50 bp. Each
assembly represents a gene or portion thereof and includes information on variants, such as
splice forms, single nucleotide polymorphisms (SNPs), insertions, deletions and other
sequence variations.

3. PathCalling™ Technology: The NOVX nucleic acid sequences are
derived by laboratory screening of cDNA libraries by the two-hybrid approach. cDNA
fragments covering either the full length of the DNA sequence, or part of the sequence, or
both, are sequenced. In silico prediction was based on sequences available in CuraGen
Corporation's proprietary sequence databases or in the public human sequence databases,
and provided either the full length DNA sequence, or some portion thereof.

The laboratory screening was performed using the methods summarized below:
cDNA libraries were derived from various human samples representing multiple
tissue types, normal and diseased states, physiological states, and developmental states
from different donors. Samples were obtained as whole tissue, primary cells or tissue
cultured primary cells or cell lines. Cells and cell lines may have been treated with
biological or chemical agents that regulate gene expression, for example, growth factors,
chemokines or steroids. The cDNA thus derived was then directionally cloned into the
appropriate two-hybrid vector (Gal4-activation domain (Gal4-AD) fusion). Such cDNA
libraries as well as commercially available cDNA libraries from Clontech (Palo Alto, CA)
were then transferred from E. coli into a CuraGen Corporation proprietary yeast strain
(disclosed in U.S. Patents 6,057,101 and 6,083,693, incorporated herein by reference in
their entireties).

Gal4-binding domain (Gal4-BD) fusions of a CuraGen Corporation proprietary
library of human sequences were used to screen multiple Gal4-AD fusion cDNA libraries,
resulting in the selection of yeast hybrid diploids in each of which the Gal4-AD fusion
contains an individual cDNA. Each sample was amplified by the polymerase chain
reaction (PCR) using non-specific primers at the cDNA insert boundaries. Such PCR
product was sequenced; sequence traces were evaluated manually and edited for
corrections if appropriate. cDNA sequences from all samples were assembled together,
sometimes including public human sequences, using bioinformatic programs to produce a
consensus sequence for each assembly. Each assembly is included in CuraGen
Corporation's database. Sequences were included as components for assembly when the
extent of identity with another component was at least 95% over 50 bp. Each assembly
represents a gene or portion thereof and includes information on variants, such as splice
forms, single nucleotide polymorphisms (SNPs), insertions, deletions and other sequence
variations.
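
The inclusion criterion stated above (identity of at least 95% over 50 bp) can be illustrated with a minimal sketch. The function name, the use of pre-aligned strings of equal length, and the treatment of gap characters as mismatches are assumptions made for illustration only; the bioinformatic programs actually used for assembly are not identified in the text.

```python
def meets_assembly_threshold(aligned_a: str, aligned_b: str,
                             min_identity_percent: int = 95,
                             min_length_bp: int = 50) -> bool:
    """Check a pairwise alignment against an identity-over-length criterion.

    Assumption: both strings are already aligned (same length, '-' for gaps);
    a column counts as a match only if both bases are equal and not a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    length = len(aligned_a)
    if length < min_length_bp:
        return False
    matches = sum(1 for x, y in zip(aligned_a.upper(), aligned_b.upper())
                  if x == y and x != "-")
    # Integer comparison avoids floating-point rounding at the boundary.
    return matches * 100 >= min_identity_percent * length


reference = "ACGT" * 15                    # hypothetical 60 bp component
candidate = "CAT" + reference[3:]          # 3 mismatches: 57/60 = 95%
print(meets_assembly_threshold(reference, candidate))  # True
```

A candidate with a fourth mismatch (56/60, about 93%) or an overlap shorter than 50 bp would be excluded under this reading of the criterion.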

Physical clone: the cDNA fragment derived by the screening procedure, covering
the entire open reading frame is, as a recombinant DNA, cloned into pACT2 plasmid
(Clontech) used to make the cDNA library. The recombinant plasmid is inserted into the
host and selected by the yeast hybrid diploid generated during the screening procedure by
the mating of both CuraGen Corporation proprietary yeast strains N106' and YULH (U.S.
Patents 6,057,101 and 6,083,693).

4. RACE: Techniques based on the polymerase chain reaction, such as rapid
amplification of cDNA ends (RACE), were used to isolate or complete the predicted
sequence of the cDNA of the invention. Usually multiple clones were sequenced from one
or more human samples to derive the sequences for fragments. Various human tissue
samples from different donors were used for the RACE reaction. The sequences derived
from these procedures were included in the SeqCalling Assembly process described in
preceding paragraphs.

5. Exon Linking: The NOVX target sequences identified in the present
invention were subjected to the exon linking process to confirm the sequence. PCR
primers were designed by starting at the most upstream sequence available, for the forward
primer, and at the most downstream sequence available for the reverse primer. In each
case, the sequence was examined, walking inward from the respective termini toward the
coding sequence, until a suitable sequence that is either unique or highly selective was
encountered, or, in the case of the reverse primer, until the stop codon was reached. Such
primers were designed based on in silico predictions for the full length cDNA, part (one or
more exons) of the DNA or protein sequence of the target sequence, or by translated
homology of the predicted exons to closely related human sequences from other species.
These primers were then employed in PCR amplification based on the following pool of
human cDNAs: adrenal gland, bone marrow, brain - amygdala, brain - cerebellum, brain -
hippocampus, brain - substantia nigra, brain - thalamus, brain - whole, fetal brain, fetal
kidney, fetal liver, fetal lung, heart, kidney, lymphoma - Raji, mammary gland, pancreas,
pituitary gland, placenta, prostate, salivary gland, skeletal muscle, small intestine, spinal
cord, spleen, stomach, testis, thyroid, trachea, uterus. Usually the resulting amplicons were
gel purified, cloned and sequenced to high redundancy. The PCR product derived from
exon linking was cloned into the pCR2.1 vector from Invitrogen. The resulting bacterial
clone has an insert covering the entire open reading frame cloned into the pCR2.1 vector.
The resulting sequences from all clones were assembled with themselves, with other
fragments in CuraGen Corporation's database and with public ESTs. Fragments and ESTs
were included as components for an assembly when the extent of their identity with another
component of the assembly was at least 95% over 50 bp. In addition, sequence traces were
evaluated manually and edited for corrections if appropriate. These procedures provide the
sequence reported herein.

6. Physical Clone: Exons were predicted by homology and the intron/exon
boundaries were determined using standard genetic rules. Exons were further selected and
refined by means of similarity determination using multiple BLAST (for example, tBlastN,
BlastX, and BlastN) searches, and, in some instances, GeneScan and Grail. Expressed
sequences from both public and proprietary databases were also added when available to
further define and complete the gene sequence. The DNA sequence was then manually
corrected for apparent inconsistencies thereby obtaining the sequences encoding the
full-length protein.

The PCR product derived by exon linking, covering the entire open reading frame,
was cloned into the pCR2.1 vector from Invitrogen to provide clones used for expression 
and screening purposes. 

Example C: Quantitative expression analysis of clones in various cells and tissues

The quantitative expression of various clones was assessed using microtiter plates
containing RNA samples from a variety of normal and pathology-derived cells, cell lines
and tissues using real time quantitative PCR (RTQ PCR). RTQ PCR was performed on an
Applied Biosystems ABI PRISM® 7700 or an ABI PRISM® 7900 HT Sequence Detection
System. Various collections of samples are assembled on the plates, and referred to as
Panel 1 (containing normal tissues and cancer cell lines), Panel 2 (containing samples
derived from tissues from normal and cancer sources), Panel 3 (containing cancer cell
lines), Panel 4 (containing cells and cell lines from normal tissues and cells related to
inflammatory conditions), Panel 5D/5I (containing human tissues and cell lines with an
emphasis on metabolic diseases), AI_comprehensive_panel (containing normal tissue and
samples from autoinflammatory diseases), Panel CNSD.01 (containing samples from
normal and diseased brains) and CNS_neurodegeneration_panel (containing samples from
normal and Alzheimer's diseased brains).

RNA integrity from all samples is controlled for quality by visual assessment of
agarose gel electropherograms using 28S and 18S ribosomal RNA staining intensity ratio
as a guide (2:1 to 2.5:1 28S:18S) and the absence of low molecular weight RNAs that would
be indicative of degradation products. Samples are controlled against genomic DNA
contamination by RTQ PCR reactions run in the absence of reverse transcriptase using
probe and primer sets designed to amplify across the span of a single exon.

First, the RNA samples were normalized to reference nucleic acids such as
constitutively expressed genes (for example, β-actin and GAPDH). Normalized RNA (5 µl)
was converted to cDNA and analyzed by RTQ-PCR using One Step RT-PCR Master Mix
Reagents (Applied Biosystems; Catalog No. 4309169) and gene-specific primers according
to the manufacturer's instructions.

In other cases, non-normalized RNA samples were converted to single strand cDNA
(sscDNA) using Superscript II (Invitrogen Corporation; Catalog No. 18064-147) and
random hexamers according to the manufacturer's instructions. Reactions containing up to
10 µg of total RNA were performed in a volume of 20 µl and incubated for 60 minutes at
42°C. This reaction can be scaled up to 50 µg of total RNA in a final volume of 100 µl.
sscDNA samples are then normalized to reference nucleic acids as described previously,
using 1X TaqMan® Universal Master mix (Applied Biosystems; catalog No. 4324020),
following the manufacturer's instructions.

Probes and primers were designed for each assay according to Applied Biosystems
Primer Express Software package (version I for Apple Computer's Macintosh Power PC) or
a similar algorithm using the target sequence as input. Default settings were used for
reaction conditions and the following parameters were set before selecting primers: primer
concentration = 250 nM, primer melting temperature (Tm) range = 58°-60°C, primer
optimal Tm = 59°C, maximum primer difference = 2°C, probe does not have 5' G, probe Tm
must be 10°C greater than primer Tm, amplicon size 75 bp to 100 bp. The probes and
primers selected (see below) were synthesized by Synthegen (Houston, TX, USA). Probes
were double purified by HPLC to remove uncoupled dye and evaluated by mass
spectroscopy to verify coupling of reporter and quencher dyes to the 5' and 3' ends of the
probe, respectively. Their final concentrations were: forward and reverse primers, 900 nM
each, and probe, 200 nM.
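
The design parameters listed above can be expressed as a simple constraint check. This is a hedged sketch, not the Primer Express algorithm: the class and function names are hypothetical, Tm values are taken as given inputs rather than computed, and "probe Tm must be 10°C greater than primer Tm" is read here as at least 10°C above the higher of the two primer Tm values.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AssayDesign:
    forward_tm: float   # forward primer Tm, degrees C
    reverse_tm: float   # reverse primer Tm, degrees C
    probe_tm: float     # probe Tm, degrees C
    probe_seq: str      # probe sequence written 5' to 3'
    amplicon_bp: int    # amplicon length in base pairs


def design_violations(d: AssayDesign) -> List[str]:
    """Return the list of stated parameters that a candidate design violates."""
    problems = []
    for name, tm in (("forward", d.forward_tm), ("reverse", d.reverse_tm)):
        if not 58.0 <= tm <= 60.0:
            problems.append(f"{name} primer Tm outside 58-60 C")
    if abs(d.forward_tm - d.reverse_tm) > 2.0:
        problems.append("primer Tm difference exceeds 2 C")
    if d.probe_seq.upper().startswith("G"):
        problems.append("probe has a 5' G")
    # Assumed reading: compare against the higher of the two primer Tm values.
    if d.probe_tm < max(d.forward_tm, d.reverse_tm) + 10.0:
        problems.append("probe Tm less than 10 C above primer Tm")
    if not 75 <= d.amplicon_bp <= 100:
        problems.append("amplicon size outside 75-100 bp")
    return problems


# Hypothetical candidate values, for illustration only.
candidate = AssayDesign(59.0, 58.5, 69.5, "CAGTTCCAGTCT", 90)
print(design_violations(candidate))  # []
```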

PCR conditions: When working with RNA samples, normalized RNA from each
tissue and each cell line was spotted in each well of either a 96-well or a 384-well PCR
plate (Applied Biosystems). PCR cocktails included either a single gene-specific probe and
primers set, or two multiplexed probe and primers sets (a set specific for the target clone
and another gene-specific set multiplexed with the target probe). PCR reactions were set up
using TaqMan® One-Step RT-PCR Master Mix (Applied Biosystems, Catalog No.
4313803) following manufacturer's instructions. Reverse transcription was performed at
48°C for 30 minutes followed by amplification/PCR cycles as follows: 95°C 10 min, then
40 cycles of 95°C for 15 seconds, 60°C for 1 minute. Results were recorded as CT values
(cycle at which a given sample crosses a threshold level of fluorescence) using a log scale,
with the difference in RNA concentration between a given sample and the sample with the
lowest CT value being represented as 2 to the power of delta CT. The percent relative
expression is then obtained by taking the reciprocal of this RNA difference and multiplying
by 100.
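
The relative expression calculation described above can be written out as a short sketch. The function name and the sample names and CT values in the example are hypothetical; only the arithmetic (2 to the power of delta CT, then the reciprocal multiplied by 100) is taken from the text.

```python
from typing import Dict


def percent_relative_expression(ct_values: Dict[str, float]) -> Dict[str, float]:
    """Convert CT values to percent expression relative to the lowest-CT sample."""
    lowest_ct = min(ct_values.values())
    result = {}
    for sample, ct in ct_values.items():
        delta_ct = ct - lowest_ct
        rna_difference = 2.0 ** delta_ct          # fold difference in RNA
        result[sample] = 100.0 / rna_difference   # reciprocal times 100
    return result


# Hypothetical CT values for three wells.
cts = {"sample_A": 25.0, "sample_B": 26.0, "sample_C": 28.0}
print(percent_relative_expression(cts))
# {'sample_A': 100.0, 'sample_B': 50.0, 'sample_C': 12.5}
```

The sample with the lowest CT is assigned 100%, and each additional cycle halves the reported expression.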

When working with sscDNA samples, normalized sscDNA was used as described
previously for RNA samples. PCR reactions containing one or two sets of probe and
primers were set up as described previously, using 1X TaqMan® Universal Master mix
(Applied Biosystems; catalog No. 4324020), following the manufacturer's instructions.
PCR amplification was performed as follows: 95°C 10 min, then 40 cycles of 95°C for 15
seconds, 60°C for 1 minute. Results were analyzed and processed as described previously.

Panels 1, 1.1, 1.2, and 1.3D 

The plates for Panels 1, 1.1, 1.2 and 1.3D include 2 control wells (genomic DNA
control and chemistry control) and 94 wells containing cDNA from various samples. The
samples in these panels are broken into 2 classes: samples derived from cultured cell lines
and samples derived from primary normal tissues. The cell lines are derived from cancers
of the following types: lung cancer, breast cancer, melanoma, colon cancer, prostate cancer,
CNS cancer, squamous cell carcinoma, ovarian cancer, liver cancer, renal cancer, gastric
cancer and pancreatic cancer. Cell lines used in these panels are widely available through
the American Type Culture Collection (ATCC), a repository for cultured cell lines, and
were cultured using the conditions recommended by the ATCC. The normal tissues found
on these panels are comprised of samples derived from all major organ systems from single
adult individuals or fetuses. These samples are derived from the following organs: adult
skeletal muscle, fetal skeletal muscle, adult heart, fetal heart, adult kidney, fetal kidney,
adult liver, fetal liver, adult lung, fetal lung, various regions of the brain, the spleen, bone
marrow, lymph node, pancreas, salivary gland, pituitary gland, adrenal gland, spinal cord,
thymus, stomach, small intestine, colon, bladder, trachea, breast, ovary, uterus, placenta,
prostate, testis and adipose.

In the results for Panels 1, 1.1, 1.2 and 1.3D, the following abbreviations are used:
ca. = carcinoma,
* = established from metastasis,
met = metastasis,
s cell var = small cell variant,
non-s = non-sm = non-small,
squam = squamous,
pl. eff = pl effusion = pleural effusion,
glio = glioma,
astro = astrocytoma, and
neuro = neuroblastoma.

General_screening_panel_v1.4, v1.5 and v1.6

The plates for Panels 1.4, 1.5, and 1.6 include 2 control wells (genomic DNA
control and chemistry control) and 94 wells containing cDNA from various samples. The
samples in Panels 1.4, 1.5, and 1.6 are broken into 2 classes: samples derived from cultured
cell lines and samples derived from primary normal tissues. The cell lines are derived from
cancers of the following types: lung cancer, breast cancer, melanoma, colon cancer,
prostate cancer, CNS cancer, squamous cell carcinoma, ovarian cancer, liver cancer, renal
cancer, gastric cancer and pancreatic cancer. Cell lines used in Panels 1.4, 1.5, and 1.6 are
widely available through the American Type Culture Collection (ATCC), a repository for
cultured cell lines, and were cultured using the conditions recommended by the ATCC. The
normal tissues found on Panels 1.4, 1.5, and 1.6 are comprised of pools of samples derived
from all major organ systems from 2 to 5 different adult individuals or fetuses. These
samples are derived from the following organs: adult skeletal muscle, fetal skeletal muscle,
adult heart, fetal heart, adult kidney, fetal kidney, adult liver, fetal liver, adult lung, fetal
lung, various regions of the brain, the spleen, bone marrow, lymph node, pancreas, salivary
gland, pituitary gland, adrenal gland, spinal cord, thymus, stomach, small intestine, colon,
bladder, trachea, breast, ovary, uterus, placenta, prostate, testis and adipose. Abbreviations
are as described for Panels 1, 1.1, 1.2, and 1.3D.

Panels 2D, 2.2, 2.3 and 2.4






The plates for Panels 2D, 2.2, 2.3 and 2.4 generally include control wells and
test samples composed of RNA or cDNA isolated from human tissue procured by surgeons
working in close cooperation with the National Cancer Institute's Cooperative Human
Tissue Network (CHTN) or the National Disease Research Initiative (NDRI), or from
Ardais or Clinomics. The tissues are derived from human malignancies and in cases where
indicated many malignant tissues have "matched margins" obtained from noncancerous
tissue just adjacent to the tumor. These are termed normal adjacent tissues and are denoted
"NAT" in the results below. The tumor tissue and the "matched margins" are evaluated by
two independent pathologists (the surgical pathologists and again by a pathologist at NDRI/
CHTN/Ardais/Clinomics). Unmatched RNA samples from tissues without malignancy
(normal tissues) were also obtained from Ardais or Clinomics. This analysis provides a
gross histopathological assessment of tumor differentiation grade. Moreover, most samples
include the original surgical pathology report that provides information regarding the
clinical stage of the patient. These matched margins are taken from the tissue surrounding
(i.e. immediately proximal to) the zone of surgery (designated "NAT", for normal adjacent
tissue, in Table RR). In addition, RNA and cDNA samples were obtained from various
human tissues derived from autopsies performed on elderly people or sudden death victims
(accidents, etc.). These tissues were ascertained to be free of disease and were purchased
from various commercial sources such as Clontech (Palo Alto, CA), Research Genetics,
and Invitrogen.

HASS Panel v 1.0 

The HASS panel v 1.0 plates are comprised of 93 cDNA samples and two controls.
Specifically, 81 of these samples are derived from cultured human cancer cell lines that had
been subjected to serum starvation, acidosis and anoxia for different time periods as well as
controls for these treatments, 3 samples of human primary cells, 9 samples of malignant
brain cancer (4 medulloblastomas and 5 glioblastomas) and 2 controls. The human cancer
cell lines are obtained from ATCC (American Type Culture Collection) and fall into the
following tissue groups: breast cancer, prostate cancer, bladder carcinomas, pancreatic
cancers and CNS cancer cell lines. These cancer cells are all cultured under standard
recommended conditions. The treatments used (serum starvation, acidosis and anoxia) have
been previously published in the scientific literature. The primary human cells were
obtained from Clonetics (Walkersville, MD) and were grown in the media and conditions
recommended by Clonetics. The malignant brain cancer samples are obtained as part of a
collaboration (Henry Ford Cancer Center) and are evaluated by a pathologist prior to
CuraGen receiving the samples. RNA was prepared from these samples using the standard
procedures. The genomic and chemistry control wells have been described previously.

ARDAIS Panel v 1.0 

The plates for ARDAIS panel v 1.0 generally include 2 control wells and 22 test
samples composed of RNA isolated from human tissue procured by surgeons working in
close cooperation with Ardais Corporation. The tissues are derived from human lung
malignancies (lung adenocarcinoma or lung squamous cell carcinoma) and in cases where
indicated many malignant samples have "matched margins" obtained from noncancerous
lung tissue just adjacent to the tumor. These matched margins are taken from the tissue
surrounding (i.e. immediately proximal to) the zone of surgery (designated "NAT", for
normal adjacent tissue, in the results below). The tumor tissue and the "matched margins"
are evaluated by independent pathologists (the surgical pathologists and again by a
pathologist at Ardais). Unmatched malignant and non-malignant RNA samples from lungs
were also obtained from Ardais. Additional information from Ardais provides a gross
histopathological assessment of tumor differentiation grade and stage. Moreover, most
samples include the original surgical pathology report that provides information regarding
the clinical state of the patient.

Panels 3D, 3.1 and 3.2

The plates of Panels 3D, 3.1 and 3.2 comprise 94 cDNA samples and two
control samples. Specifically, 92 of these samples are derived from cultured human cancer
cell lines, 2 are samples of human primary cerebellar tissue and 2 are controls. The human cell
lines are generally obtained from ATCC (American Type Culture Collection), NCI or the
German tumor cell bank and fall into the following tissue groups: squamous cell carcinoma
of the tongue, breast cancer, prostate cancer, melanoma, epidermoid carcinoma, sarcomas,
bladder carcinomas, pancreatic cancers, kidney cancers, leukemias/lymphomas,
ovarian/uterine/cervical, gastric, colon, lung and CNS cancer cell lines. In addition, there
are two independent samples of cerebellum. These cells are all cultured under standard
recommended conditions and RNA extracted using the standard procedures. The cell lines
in panels 3D, 3.1, 3.2, 1, 1.1, 1.2, 1.3D, 1.4, 1.5 and 1.6 are among the most common cell lines
used in the scientific literature.

Panels 4D, 4R, and 4.1D
Panel 4 includes samples on a 96 well plate (2 control wells, 94 test samples)
composed of RNA (Panel 4R) or cDNA (Panels 4D/4.1D) isolated from various human cell
lines or tissues related to inflammatory conditions. Total RNA from control normal tissues
such as colon and lung (Stratagene, La Jolla, CA) and thymus and kidney (Clontech) was
employed. Total RNA from liver tissue from cirrhosis patients and kidney from lupus
patients was obtained from BioChain (Biochain Institute, Inc., Hayward, CA). Intestinal
tissue for RNA preparation from patients diagnosed as having Crohn's disease and
ulcerative colitis was obtained from the National Disease Research Interchange (NDRI)
(Philadelphia, PA).

Astrocytes, lung fibroblasts, dermal fibroblasts, coronary artery smooth muscle
cells, small airway epithelium, bronchial epithelium, microvascular dermal endothelial
cells, microvascular lung endothelial cells, human pulmonary aortic endothelial cells and
human umbilical vein endothelial cells were all purchased from Clonetics (Walkersville,
MD) and grown in the media supplied for these cell types by Clonetics. These primary cell
types were activated with various cytokines or combinations of cytokines for 6 and/or
12-14 hours, as indicated. The following cytokines were used: IL-1 beta at approximately
1-5ng/ml, TNF alpha at approximately 5-10ng/ml, IFN gamma at approximately
20-50ng/ml, IL-4 at approximately 5-10ng/ml, IL-9 at approximately 5-10ng/ml and IL-13 at
approximately 5-10ng/ml. Endothelial cells were sometimes starved for various times by
culture in the basal media from Clonetics with 0.1% serum.

Mononuclear cells were prepared from blood of employees at CuraGen
Corporation, using Ficoll. LAK cells were prepared from these cells by culture in DMEM
5% FCS (Hyclone), 100µM non essential amino acids (Gibco/Life Technologies,
Rockville, MD), 1mM sodium pyruvate (Gibco), mercaptoethanol 5.5x10⁻⁵M (Gibco), and
10mM Hepes (Gibco) and Interleukin 2 for 4-6 days. Cells were then either activated with
10-20ng/ml PMA and 1-2µg/ml ionomycin, IL-12 at 5-10ng/ml, IFN gamma at 20-50ng/ml
and IL-18 at 5-10ng/ml for 6 hours. In some cases, mononuclear cells were cultured for 4-5
days in DMEM 5% FCS (Hyclone), 100µM non essential amino acids (Gibco), 1mM
sodium pyruvate (Gibco), mercaptoethanol 5.5x10⁻⁵M (Gibco), and 10mM Hepes (Gibco)
with PHA (phytohemagglutinin) or PWM (pokeweed mitogen) at approximately 5µg/ml.
Samples were taken at 24, 48 and 72 hours for RNA preparation. MLR (mixed lymphocyte
reaction) samples were obtained by taking blood from two donors, isolating the
mononuclear cells using Ficoll and mixing the isolated mononuclear cells 1:1 at a final
concentration of approximately 2x10⁶ cells/ml in DMEM 5% FCS (Hyclone), 100µM non
essential amino acids (Gibco), 1mM sodium pyruvate (Gibco), mercaptoethanol
(5.5x10⁻⁵M) (Gibco), and 10mM Hepes (Gibco). The MLR was cultured and samples taken
at various time points ranging from 1-7 days for RNA preparation.

Monocytes were isolated from mononuclear cells using CD14 Miltenyi Beads, positive
VS selection columns and a Vario Magnet according to the manufacturer's instructions.
Monocytes were differentiated into dendritic cells by culture in DMEM 5% fetal calf serum
(FCS) (Hyclone, Logan, UT), 100µM non essential amino acids (Gibco), 1mM sodium
pyruvate (Gibco), mercaptoethanol 5.5x10⁻⁵M (Gibco), and 10mM Hepes (Gibco), 50ng/ml
GMCSF and 5ng/ml IL-4 for 5-7 days. Macrophages were prepared by culture of
monocytes for 5-7 days in DMEM 5% FCS (Hyclone), 100µM non essential amino acids
(Gibco), 1mM sodium pyruvate (Gibco), mercaptoethanol 5.5x10⁻⁵M (Gibco), 10mM
Hepes (Gibco) and 10% AB Human Serum or MCSF at approximately 50ng/ml.
Monocytes, macrophages and dendritic cells were stimulated for 6 and 12-14 hours with
lipopolysaccharide (LPS) at 100ng/ml. Dendritic cells were also stimulated with anti-CD40
monoclonal antibody (Pharmingen) at 10µg/ml for 6 and 12-14 hours.

CD4 lymphocytes, CD8 lymphocytes and NK cells were also isolated from
mononuclear cells using CD4, CD8 and CD56 Miltenyi beads, positive VS selection
columns and a Vario Magnet according to the manufacturer's instructions. CD45RA and
CD45RO CD4 lymphocytes were isolated by depleting mononuclear cells of CD8, CD56,
CD14 and CD19 cells using CD8, CD56, CD14 and CD19 Miltenyi beads and positive
selection. CD45RO beads were then used to isolate the CD45RO CD4 lymphocytes with
the remaining cells being CD45RA CD4 lymphocytes. CD45RA CD4, CD45RO CD4 and
CD8 lymphocytes were placed in DMEM 5% FCS (Hyclone), 100µM non essential amino
acids (Gibco), 1mM sodium pyruvate (Gibco), mercaptoethanol 5.5x10⁻⁵M (Gibco), and
10mM Hepes (Gibco) and plated at 10⁶ cells/ml onto Falcon 6 well tissue culture plates that
had been coated overnight with 0.5µg/ml anti-CD28 (Pharmingen) and 3µg/ml anti-CD3
(OKT3, ATCC) in PBS. After 6 and 24 hours, the cells were harvested for RNA
preparation. To prepare chronically activated CD8 lymphocytes, we activated the isolated
CD8 lymphocytes for 4 days on anti-CD28 and anti-CD3 coated plates and then harvested
the cells and expanded them in DMEM 5% FCS (Hyclone), 100µM non essential amino
acids (Gibco), 1mM sodium pyruvate (Gibco), mercaptoethanol 5.5x10⁻⁵M (Gibco), and
10mM Hepes (Gibco) and IL-2. The expanded CD8 cells were then activated again with
plate bound anti-CD3 and anti-CD28 for 4 days and expanded as before. RNA was isolated
6 and 24 hours after the second activation and after 4 days of the second expansion culture.
The isolated NK cells were cultured in DMEM 5% FCS (Hyclone), 100µM non essential
amino acids (Gibco), 1mM sodium pyruvate (Gibco), mercaptoethanol 5.5x10⁻⁵M (Gibco),
and 10mM Hepes (Gibco) and IL-2 for 4-6 days before RNA was prepared.

To obtain B cells, tonsils were procured from NDRI. The tonsil was cut up with
sterile dissecting scissors and then passed through a sieve. Tonsil cells were then spun
down and resuspended at 10⁶ cells/ml in DMEM 5% FCS (Hyclone), 100µM non essential
amino acids (Gibco), 1mM sodium pyruvate (Gibco), mercaptoethanol 5.5x10⁻⁵M (Gibco),
and 10mM Hepes (Gibco). To activate the cells, we used PWM at 5µg/ml or anti-CD40
(Pharmingen) at approximately 10µg/ml and IL-4 at 5-10ng/ml. Cells were harvested for
RNA preparation at 24, 48 and 72 hours.

To prepare the primary and secondary Th1/Th2 and Tr1 cells, six-well Falcon plates
were coated overnight with 10µg/ml anti-CD28 (Pharmingen) and 2µg/ml OKT3 (ATCC),
and then washed twice with PBS. Umbilical cord blood CD4 lymphocytes (Poietic
Systems, Germantown, MD) were cultured at 10⁵-10⁶ cells/ml in DMEM 5% FCS
(Hyclone), 100µM non essential amino acids (Gibco), 1mM sodium pyruvate (Gibco),
mercaptoethanol 5.5x10⁻⁵M (Gibco), 10mM Hepes (Gibco) and IL-2 (4ng/ml). IL-12
(5ng/ml) and anti-IL4 (1µg/ml) were used to direct to Th1, while IL-4 (5ng/ml) and
anti-IFN gamma (1µg/ml) were used to direct to Th2 and IL-10 at 5ng/ml was used to
direct to Tr1. After 4-5 days, the activated Th1, Th2 and Tr1 lymphocytes were washed
once in DMEM and expanded for 4-7 days in DMEM 5% FCS (Hyclone), 100µM non
essential amino acids (Gibco), 1mM sodium pyruvate (Gibco), mercaptoethanol 5.5x10⁻⁵M
(Gibco), 10mM Hepes (Gibco) and IL-2 (1ng/ml). Following this, the activated Th1, Th2
and Tr1 lymphocytes were re-stimulated for 5 days with anti-CD28/OKT3 and cytokines as
described above, but with the addition of anti-CD95L (1µg/ml) to prevent apoptosis. After
4-5 days, the Th1, Th2 and Tr1 lymphocytes were washed and then expanded again with
IL-2 for 4-7 days. Activated Th1 and Th2 lymphocytes were maintained in this way for a
maximum of three cycles. RNA was prepared from primary and secondary Th1, Th2 and
Tr1 after 6 and 24 hours following the second and third activations with plate bound
anti-CD3 and anti-CD28 mAbs and 4 days into the second and third expansion cultures in
Interleukin 2.
The following leukocyte cell lines were obtained from the ATCC: [...]
KU-812. EOL cells were further differentiated by culture in 0.1mM dbcAMP at
5x10⁵ cells/ml for 8 days, changing the media every 3 days and adjusting the cell
concentration to 5x10⁵ cells/ml. For the culture of these cells, we used DMEM or RPMI (as
recommended by the ATCC), with the addition of 5% FCS (Hyclone), 100µM non
essential amino acids (Gibco), 1mM sodium pyruvate (Gibco), mercaptoethanol 5.5x10⁻⁵M
(Gibco), 10mM Hepes (Gibco). RNA was either prepared from resting cells or cells
activated with PMA at 10ng/ml and ionomycin at 1µg/ml for 6 and 14 hours. Keratinocyte
line CCD1106 and an airway epithelial tumor line NCI-H292 were also obtained from the
ATCC. Both were cultured in DMEM 5% FCS (Hyclone), 100µM non essential amino
acids (Gibco), 1mM sodium pyruvate (Gibco), mercaptoethanol 5.5x10⁻⁵M (Gibco), and
10mM Hepes (Gibco). CCD1106 cells were activated for 6 and 14 hours with
approximately 5ng/ml TNF alpha and 1ng/ml IL-1 beta, while NCI-H292 cells were
activated for 6 and 14 hours with the following cytokines: 5ng/ml IL-4, 5ng/ml IL-9,
5ng/ml IL-13 and 25ng/ml IFN gamma.

For these cell lines and blood cells, RNA was prepared by lysing approximately
10⁷ cells/ml using Trizol (Gibco BRL). Briefly, 1/10 volume of bromochloropropane
(Molecular Research Corporation) was added to the RNA sample, which was vortexed; after 10
minutes at room temperature, the tubes were spun at 14,000 rpm in a Sorvall SS34 rotor.
The aqueous phase was removed and placed in a 15ml Falcon Tube. An equal volume of
isopropanol was added and left at -20°C overnight. The precipitated RNA was spun down
at 9,000 rpm for 15 min in a Sorvall SS34 rotor and washed in 70% ethanol. The pellet was
redissolved in 300µl of RNAse-free water, and 35µl buffer (Promega), 5µl DTT, 7µl
RNAsin and 8µl DNAse were added. The tube was incubated at 37°C for 30 minutes to
remove contaminating genomic DNA, extracted once with phenol chloroform and
re-precipitated with 1/10 volume of 3M sodium acetate and 2 volumes of 100% ethanol.
The RNA was spun down and placed in RNAse free water. RNA was stored at -80°C.
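
The volume ratios in the RNA preparation above (1/10 volume of bromochloropropane, an equal volume of isopropanol, 1/10 volume of 3M sodium acetate and 2 volumes of ethanol) can be turned into concrete pipetting volumes. The following Python sketch is a hedged illustration: the function name, the example lysate and aqueous-phase volumes, and the assumption that the full DNase reaction volume is carried into the re-precipitation are not stated in the text. The DNase reaction components are the ones listed above.

```python
# Components of the DNase treatment as listed in the text (volumes in microliters).
DNASE_REACTION_UL = {
    "RNAse-free water": 300,
    "buffer (Promega)": 35,
    "DTT": 5,
    "RNAsin": 7,
    "DNAse": 8,
}


def rna_prep_volumes(lysate_ml, aqueous_phase_ml, reaction_ul):
    """Reagent volumes implied by the stated volume ratios.

    lysate_ml        -- Trizol lysate volume (ml), a hypothetical input
    aqueous_phase_ml -- recovered aqueous phase (ml), a hypothetical input
    reaction_ul      -- volume entering the re-precipitation (µl); assumed here
                        to equal the full DNase reaction volume
    """
    return {
        "bromochloropropane_ml": lysate_ml / 10,   # 1/10 volume
        "isopropanol_ml": aqueous_phase_ml,        # equal volume
        "sodium_acetate_3M_ul": reaction_ul / 10,  # 1/10 volume
        "ethanol_100pct_ul": reaction_ul * 2,      # 2 volumes
    }


if __name__ == "__main__":
    reaction_ul = sum(DNASE_REACTION_UL.values())
    for reagent, volume in rna_prep_volumes(1.0, 0.5, reaction_ul).items():
        print(f"{reagent}: {volume}")
```

In practice some volume is lost during the phenol chloroform extraction, so the re-precipitation volumes would be based on the volume actually recovered rather than on the nominal reaction volume.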

AI_comprehensive panel_v1.0

The plates for AI_comprehensive panel_v1.0 include two control wells and 89 test
samples comprised of cDNA isolated from surgical and postmortem human tissues
obtained from the Backus Hospital and Clinomics (Frederick, MD). Total RNA was
extracted from tissue samples from the Backus Hospital in the Facility at CuraGen. Total
RNA from other tissues was obtained from Clinomics.
Joint tissues including synovial fluid, synovium, bone and cartilage were obtained
from patients undergoing total knee or hip replacement surgery at the Backus Hospital.
Tissue samples were immediately snap frozen in liquid nitrogen to ensure that isolated
RNA was of optimal quality and not degraded. Additional samples of osteoarthritis and
rheumatoid arthritis joint tissues were obtained from Clinomics. Normal control tissues
were supplied by Clinomics and were obtained during autopsy of trauma victims.

Surgical specimens of psoriatic tissues and adjacent matched tissues were provided
as total RNA by Clinomics. Two male and two female patients between the ages of 25
and 47 were selected. None of the patients were taking prescription drugs at the time samples
were isolated.

Surgical specimens of diseased colon from patients with ulcerative colitis and
Crohn's disease and adjacent matched tissues were obtained from Clinomics. Bowel tissue
from three female and three male Crohn's patients between the ages of 41 and 69 was used.
Two patients were not on prescription medication while the others were taking
dexamethasone, phenobarbital, or tylenol. Ulcerative colitis tissue was from three male and
four female patients. Four of the patients were taking lebvid and two were on
phenobarbital.

Total RNA from post mortem lung tissue from trauma victims with no disease or
with emphysema, asthma or COPD was purchased from Clinomics. Emphysema patients
ranged in age from 40-70 and all were smokers; this age range was chosen to focus on
patients with cigarette-linked emphysema and to avoid those patients with
alpha-1 anti-trypsin deficiencies. Asthma patients ranged in age from 36-75, and smokers
were excluded to avoid patients who could also have COPD. COPD patients ranged in age
from 35-80 and included both smokers and non-smokers. Most patients were taking
corticosteroids and bronchodilators.

In the labels employed to identify tissues in the AI_comprehensive panel_v1.0
panel, the following abbreviations are used:

AI = Autoimmunity
Syn = Synovial
Normal = No apparent disease
Rep22/Rep20 = Individual patients
RA = Rheumatoid arthritis
Backus = From Backus Hospital
OA = Osteoarthritis
(SS) (BA) (MF) = Individual patients
Adj = Adjacent tissue
Match control = Adjacent tissues
-M = Male
-F = Female
COPD = Chronic obstructive pulmonary disease



AI.05 chondrosarcoma

The AI.05 chondrosarcoma plates are comprised of SW1353 cells that had been subjected
to serum starvation and treatment with cytokines that are known to induce MMP (1, 3 and 13)
synthesis (e.g., IL-1beta). These treatments include: IL-1β (10 ng/ml), IL-1β + TNF-α (50 ng/ml),
IL-1β + Oncostatin (50 ng/ml) and PMA (100 ng/ml). The SW1353 cells were obtained from
ATCC (American Type Culture Collection) and were all cultured under standard recommended
conditions. The SW1353 cells were plated at 3x10⁵ cells/ml (in DMEM medium-10% FBS)
in 6-well plates. The treatment was done in triplicate, for 6 and 18 h. The supernatants were
collected for analysis of MMP 1, 3 and 13 production and for RNA extraction. RNA was prepared
from these samples using the standard procedures.

Panels 5D and 5I

The plates for Panels 5D and 5I include two control wells and a variety of cDNAs
isolated from human tissues and cell lines with an emphasis on metabolic diseases.
Metabolic tissues were obtained from patients enrolled in the Gestational Diabetes study.
Cells were obtained during different stages in the differentiation of adipocytes from human
mesenchymal stem cells. Human pancreatic islets were also obtained.

In the Gestational Diabetes study, subjects are young (18-40 years), otherwise
healthy women with and without gestational diabetes undergoing routine (elective)
Caesarean section. After delivery of the infant, when the surgical incisions were being
repaired/closed, the obstetrician removed a small sample (<1 cc) of the exposed metabolic
tissues during the closure of each surgical level. The biopsy material was rinsed in sterile
saline, blotted and fast frozen within 5 minutes from the time of removal. The tissue was
then flash frozen in liquid nitrogen and stored, individually, in sterile screw-top tubes and
kept on dry ice for shipment to or to be picked up by CuraGen. The metabolic tissues of
interest include uterine wall (smooth muscle), visceral adipose, [...]
subcutaneous adipose. Patient descriptions are as follows:

Patient 2: Diabetic Hispanic, overweight, not on insulin 

Patient 7-9: Nondiabetic Caucasian and obese (BMI>30) 
Patient 10: Diabetic Hispanic, overweight, on insulin

Patient 11: Nondiabetic African American and overweight 

Patient 12: Diabetic Hispanic on insulin 

Adipocyte differentiation was induced in donor progenitor cells obtained from Osirus
(a division of Clonetics/BioWhittaker) in triplicate, except for Donor 3U which had only
two replicates. Scientists at Clonetics isolated, grew and differentiated human
mesenchymal stem cells (HuMSCs) for CuraGen based on the published protocol found in
Mark F. Pittenger, et al., Multilineage Potential of Adult Human Mesenchymal Stem Cells,
Science, Apr 2 1999: 143-147. Clonetics provided Trizol lysates or frozen pellets suitable
for mRNA isolation and ds cDNA production. A general description of each donor is as
follows:

Donor 2 and 3 U: Mesenchymal Stem cells, Undifferentiated Adipose
Donor 2 and 3 AM: Adipose, Adipose Midway Differentiated
Donor 2 and 3 AD: Adipose, Adipose Differentiated

Human cell lines were generally obtained from ATCC (American Type Culture
Collection), NCI or the German tumor cell bank and fall into the following tissue groups:
kidney proximal convoluted tubule, uterine smooth muscle cells, small intestine, liver
HepG2 cancer cells, heart primary stromal cells, and adrenal cortical adenoma cells. These
cells are all cultured under standard recommended conditions and RNA extracted using the
standard procedures. All samples were processed at CuraGen to produce single stranded
cDNA.

Panel 5I contains all samples previously described with the addition of pancreatic
islets from a 58 year old female patient obtained from the Diabetes Research Institute at the
University of Miami School of Medicine. Islet tissue was processed to total RNA at an
outside source and delivered to CuraGen for addition to panel 5I.

In the labels employed to identify tissues in the 5D and 5I panels, the following
abbreviations are used:
GO Adipose = Greater Omentum Adipose
SK = Skeletal Muscle
UT = Uterus
PL = Placenta
AD = Adipose Differentiated
AM = Adipose Midway Differentiated
U = Undifferentiated Stem Cells

Panel CNSD.01

The plates for Panel CNSD.01 include two control wells and 94 test samples
comprised of cDNA isolated from postmortem human brain tissue obtained from the
Harvard Brain Tissue Resource Center. Brains are removed from calvaria of donors
between 4 and 24 hours after death, sectioned by neuroanatomists, and frozen at -80°C in
liquid nitrogen vapor. All brains are sectioned and examined by neuropathologists to
confirm diagnoses with clear associated neuropathology.

Disease diagnoses are taken from patient records. The panel contains two brains
from each of the following diagnoses: Alzheimer's disease, Parkinson's disease,
Huntington's disease, Progressive Supranuclear Palsy, Depression, and "Normal controls".
Within each of these brains, the following regions are represented: cingulate gyrus,
temporal pole, globus pallidus, substantia nigra, Brodmann Area 4 (primary motor strip),
Brodmann Area 7 (parietal cortex), Brodmann Area 9 (prefrontal cortex), and Brodmann Area
17 (occipital cortex). Not all brain regions are represented in all cases; e.g., Huntington's
disease is characterized in part by neurodegeneration in the globus pallidus, thus this
region is impossible to obtain from confirmed Huntington's cases. Likewise, Parkinson's
disease is characterized by degeneration of the substantia nigra, making this region more
difficult to obtain. Normal control brains were examined for neuropathology and found to
be free of any pathology consistent with neurodegeneration.

In the labels employed to identify tissues in the CNS panel, the following
abbreviations are used:

PSP = Progressive supranuclear palsy
Sub Nigra = Substantia nigra
Glob Palladus = Globus pallidus
Temp Pole = Temporal pole
Cing Gyr = Cingulate gyrus
BA 4 = Brodmann Area 4

Panel CNS_Neurodegeneration_V1.0 

The plates for Panel CNS_Neurodegeneration_V1.0 include two control wells and
47 test samples comprised of cDNA isolated from postmortem human brain tissue obtained
from the Harvard Brain Tissue Resource Center (McLean Hospital) and the Human Brain
and Spinal Fluid Resource Center (VA Greater Los Angeles Healthcare System). Brains are
removed from calvaria of donors between 4 and 24 hours after death, sectioned by
neuroanatomists, and frozen at -80°C in liquid nitrogen vapor. All brains are sectioned and
examined by neuropathologists to confirm diagnoses with clear associated neuropathology.

Disease diagnoses are taken from patient records. The panel contains six brains
from Alzheimer's disease (AD) patients, and eight brains from "Normal controls" who
showed no evidence of dementia prior to death. The eight normal control brains are divided
into two categories: controls with no dementia and no Alzheimer's like pathology
(Controls) and controls with no dementia but evidence of severe Alzheimer's like
pathology (specifically senile plaque load rated as level 3 on a scale of 0-3; 0 = no
evidence of plaques, 3 = severe AD senile plaque load). Within each of these brains, the
following regions are represented: hippocampus, temporal cortex (Brodmann Area 21),
parietal cortex (Brodmann Area 7), and occipital cortex (Brodmann Area 17). These regions
were chosen to encompass all levels of neurodegeneration in AD. The hippocampus is a
region of early and severe neuronal loss in AD; the temporal cortex is known to show
neurodegeneration in AD after the hippocampus; the parietal cortex shows moderate
neuronal death in the late stages of the disease; the occipital cortex is spared in AD and
therefore acts as a "control" region within AD patients. Not all brain regions are
represented in all cases.

In the labels employed to identify tissues in the CNS_Neurodegeneration_V1.0
panel, the following abbreviations are used:

AD = Alzheimer's disease brain; patient was demented and showed AD-like
pathology upon autopsy
Control = Control brains; patient not demented, showing no neuropathology
Control (Path) = Control brains; patient not demented but showing severe AD-like
pathology
SupTemporal Ctx = Superior Temporal Cortex
Inf Temporal Ctx = Inferior Temporal Cortex

A. CG106764-01: RHO/RAC-INTERACTING CITRON KINASE. 

Expression of gene CG106764-01 was assessed using the primer-probe set Ag2100,
described in Table AA. Results of the RTQ-PCR runs are shown in Tables AB, AC, AD,
AE, AF, AG, AH and AI.

Table AA. Probe Name Ag2100

Primers | Sequence | Length | Start Position | SEQ ID No
Forward | 5'-agatccctggaacagaggatt-3' | 21 | 2446 | 249
Probe | TET-5'-tgtctgaagccaataaacttgcagca-3'-TAMRA | 26 | 2474 | 250
Reverse | 5'-ccttcatgttcctttgggtaa-3' | 21 | 2513 | 251
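
The entries of Table AA can be checked for internal consistency. The Python sketch below confirms that each sequence has the stated length and that the three oligonucleotides sit in order without overlapping, and it derives the span they cover. It rests on two assumptions not stated in the table: that the Start Position values are 1-based coordinates on a single strand, and that the reverse primer's Start Position is its lowest coordinate. The class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Oligo:
    name: str
    sequence: str       # bases as printed in Table AA
    stated_length: int  # "Length" column
    start: int          # "Start Position" column

    @property
    def end(self) -> int:
        # Assumes 1-based, inclusive coordinates on one strand.
        return self.start + len(self.sequence) - 1


AG2100 = [
    Oligo("Forward", "agatccctggaacagaggatt", 21, 2446),
    Oligo("Probe", "tgtctgaagccaataaacttgcagca", 26, 2474),
    Oligo("Reverse", "ccttcatgttcctttgggtaa", 21, 2513),
]


def check_primer_probe_set(oligos):
    """Validate lengths, alphabet and ordering; return the covered span in bases."""
    for oligo in oligos:
        if len(oligo.sequence) != oligo.stated_length:
            raise ValueError(f"{oligo.name}: length does not match the table")
        if set(oligo.sequence) - set("acgt"):
            raise ValueError(f"{oligo.name}: unexpected character in sequence")
    for left, right in zip(oligos, oligos[1:]):
        if left.end >= right.start:
            raise ValueError(f"{left.name} overlaps {right.name}")
    return oligos[-1].end - oligos[0].start + 1


if __name__ == "__main__":
    print(check_primer_probe_set(AG2100))  # 88 under the stated assumptions
```

Under these assumptions the forward primer covers positions 2446-2466, the probe 2474-2499 and the reverse primer 2513-2533, giving a span of 88 bases.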



Table AB. AI.05 chondrosarcoma

Tissue Name | Rel. Exp.(%) Ag2100, Run 306913849 | Tissue Name | Rel. Exp.(%) Ag2100, Run 306913849
138353_PMA (18hrs) | 9.3 | 138346_IL-1beta + Oncostatin M (6hrs) | 64.2
138352_IL-1beta + Oncostatin M (18hrs) | 5.5 | 138345_IL-1beta+TNFa (6hrs) | 44.8
138351_IL-1beta+TNFa (18hrs) | 12.5 | 138344_IL-1beta (6hrs) | 25.5
138350_IL-1beta (18hrs) | 12.5 | 138349_Untreated-serum starved (6hrs) | 100.0
138354_Untreated-complete medium (18hrs) | 13.2 | 138348_Untreated-complete medium (6hrs) | 41.2
138347_PMA (6hrs) | 34.9 | |
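
Most treatments in Table AB are listed at both 6 and 18 hours, so the two time points can be compared directly. The Python sketch below pairs them and computes a 6 h to 18 h ratio per treatment. The values are transcribed from Table AB; the ratio is only an illustrative summary and is not an analysis described in the source, and the function name is hypothetical.

```python
# Rel. Exp.(%) values transcribed from Table AB (Ag2100, Run 306913849),
# keyed by (treatment, hours).
TABLE_AB = {
    ("PMA", 6): 34.9,
    ("PMA", 18): 9.3,
    ("IL-1beta + Oncostatin M", 6): 64.2,
    ("IL-1beta + Oncostatin M", 18): 5.5,
    ("IL-1beta+TNFa", 6): 44.8,
    ("IL-1beta+TNFa", 18): 12.5,
    ("IL-1beta", 6): 25.5,
    ("IL-1beta", 18): 12.5,
    ("Untreated-complete medium", 6): 41.2,
    ("Untreated-complete medium", 18): 13.2,
    ("Untreated-serum starved", 6): 100.0,  # no 18 h sample is listed
}


def ratio_6h_to_18h(table):
    """Return {treatment: 6 h value / 18 h value} where both values exist."""
    ratios = {}
    for (treatment, hours), early in table.items():
        if hours != 6:
            continue
        late = table.get((treatment, 18))
        if late:  # skips treatments with a missing or zero 18 h value
            ratios[treatment] = early / late
    return ratios


if __name__ == "__main__":
    ranked = sorted(ratio_6h_to_18h(TABLE_AB).items(), key=lambda kv: -kv[1])
    for treatment, ratio in ranked:
        print(f"{treatment}: {ratio:.2f}")
```

Because the serum-starved sample appears only at 6 hours, it is left out of the comparison rather than being paired with an unrelated sample.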
Table AC. AI comprehensive panel v1.0



Tissue Name 


Rel. 

Exp.(%) 
Ag2100, 
Run 

211059880 


Rel.

Exp.(%) 
Ag2100, 
Run 

212328504 


Tissue Name


Rel. 

Exp.(%) 
Ag2100, 
Run 

211059880 


Rel. 

Exp.(%) 
Ag2100, 
Run 

212328504 


110967 COPD-F


0.5 


0.8 


112427 Match Control 
Psoriasis-F


2.9 


1.8 


110980 COPD-F 


1.5 


1.2 


112418 Psoriasis-M 


0.8 


0.8 


110968 COPD-M 


0.4 


0.6 


112723 Match Control 
Psoriasis-M 


6.1 


7.4 


110977 COPD-M 


1.5 


1.9


112419 Psoriasis-M 


1.0 


1.3 


110989 

Emphysema-F 


4.2 


6.0 


112424 Match Control 
Psoriasis-M 


0.4 


1.2 


110992 

Emphysema-F 


2.8


2.9 


112420 Psoriasis-M 


1.8 


2.4 


110993 

Emphysema-F 


0.9 


0.8 


112425 Match Control 
Psoriasis-M 


2.2 


2.7 


110994 

Emphysema-F 


0.7 


0.4 


104689 (MF) OA 
Bone-Backus 


12.1 


13.2 


110995 

Emphysema-F 


2.0 


5.4 


104690 (MF) Adj

"Normal" 

Bone-Backus 


5.4 


4.2 


110996 

Emphysema-F 


2.2 


2.4 


104691 (MF)OA 
Synovium-Backus 


43.2 


35.6 


110997 Asthma-M


1.9 


3.1 


104692 (BA) OA 
Cartilage-Backus 


0.9 


0.4 


111001 Asthma-F 


1.4 


2.7 


104694 (BA) OA 
Bone-Backus 


16.8 


16.7 


111002 Asthma-F 


1.0 


1.0 


104695 (BA) Adj 

"Normal" 

Bone-Backus 


6.5 


6.1 


111003 Atopic 
Asthma-F 




Z.Z 


104696 (BA) OA 
Synovium-Backus 


oa n 


OA 1 


111004 Atopic 
Asthma-F 


166 


17.0 


104700 (SS)OA 
Bone-Backus 


12.2 


35.1 


111005 Atopic 
Asthma-F 


7.2 


5.5 


104701 (SS)Adj 

"Normal" 

Bone-Backus 


7.9 


9.5 


111006 Atopic 
Asthma-F 


0.9 


0.7 


104702 (SS) OA 
Synovium-Backus 


8.2 


7.9 


111417 Allergy-M 


1.9 


2.4 


117093 OA Cartilage 
Rep7 


2.0 


2.3 


112347 Allergy-M


0.0 


0.1 


112672 OA Bone5 


1.9 


0.8 


112349 Normal 
Lung-F 


0.0 


0.0 


112673 OA 
Synoviums 


0.3 


1.2 
112357 Normal
Lung-F 


ft 1 


6 n 


112674 OA Synovial
Fluid cells5 


r-rra oner 

0.5 


imM mJL .—5 J , . 

04 


112354 Normal 
Lung-M 


i < 
i, j 




117100 OA Cartilage 
Repl4 


0.4


0.3 


112374 Crohns-F


2.9 


5.2 


112756 OA Bone9 


100.0


100.0 


112389 Match 
Control Crohns-F 


o n 


f% ft 
u.o 


112757 OA 
Synovium9 


0.5 


0.2 


iizj/j uronns-r 




^ ft 


112758 OA Synovial 
Fluid Cells9 


0 8 

\J.O 


1 5 


112732 Match 
Control Crohns-F 






117125 RA Cartilage 
Rep2 


1 0 


06 


112725 Crohns-M


0.1 


0.7 


113492 Bone2 RA 


2.8 


3.6 


112387 Match 
Control Crohns-M 


1 o 


1 A 


113493 Synovium2 
RA 


1 7 


07 


1 1237o Cronns-M 


u.o 


U.U 


113494 Syn Fluid 
Cells RA 


ft 0 


2 1 


112390 Match 
Control Crohns-M 


2.5 


1.8 


113499 Cartilage4RA 


2.1 


1.8 


112726 Crohns-M


3.8 


5.9 


113500 Bone4RA


1.8 


2.5 


1 17711 Match 
Control Crohns-M 


3.6 


6.7 


113501 Synovium4 
RA 


2.1 


2.3 


112380 Ulcer 


4.9 


4.9 


113502 Syn Fluid 
Cells4 RA 


1.0 


0.8 


112734 Match 
Control Ulcer 
Col-F 


12.6 


12.0 


113495 Cartilage3RA 


2.5 


2.6 


112384 Ulcer 
Col-F 


6.6 


10.2 


113496 Bone3RA


2.0 


2.1 


112737 Match 
Control Ulcer 
Col-F 


4.2 


6.1 


113497 Synovium3 
RA 


1.4 


1.4 


112386 Ulcer 
Col-F 


0.5 


1.2 


113498 Syn Fluid 
Cells3 RA 


2.9 


3.2 


112738 Match 
Control Ulcer 
Col-F 


7.5 


7.9 


117106 Normal 
Cartilage Rep20 


0.1 


0.7 


112381 Ulcer 
Col-M 


0.1 


0.1 


113663 Bone3 Normal 


0.3 


0.1 


112735 Match 
Control Ulcer 
Col-M 


2.9 


2.3 


113664 Synovium3 
Normal 


0.0 


0.0 


112382 Ulcer 
Col-M 


6.7 


8.4 


113665 Syn Fluid 
Cells3 Normal 


0.1 


0.2 


112394 Match 
Control Ulcer 
Col-M 


0.5 


0.5 


117107 Normal 
Cartilage Rep22 


0.9 


0.3 


112383 Ulcer 
Col-M 


12.1 


14.6 


113667 Bone4 Normal 


0.4 


0.7 
112736 Match
Control Ulcer 
Col-M 



112423 Psoriasis-F 



3.5 



1.4 



5.3 



113668 Synovium4 
Normal 



1.1 



113669 Syn Fluid
Cells4 Normal 



1.0 



1.0 



«J* JL ««!« J**' 1 z $ 
1.1 



0.7 



Table AD. CNS neurodegeneration v1.0

Tissue Name | Rel. Exp.(%) Ag2100, Run 207929343 | Tissue Name | Rel. Exp.(%) Ag2100, Run 207929343
AD 1 Hippo | [illegible] | Control (Path) 3 Temporal Ctx | 8.5
AD 2 Hippo | [illegible] | Control (Path) 4 Temporal Ctx | 55.5
AD 3 Hippo | [illegible] | AD 1 Occipital Ctx | 31.6
AD 4 Hippo | 7.9 | AD 2 Occipital Ctx (Missing) | 0.0
AD 5 Hippo | [illegible] | AD 3 Occipital Ctx | 8.4
AD 6 Hippo | [illegible] | AD 4 Occipital Ctx | 28.7
Control 2 Hippo | 17.7 | AD 5 Occipital Ctx | 52.5
Control 4 Hippo | 3.4 | AD 6 Occipital Ctx | 22.8
Control (Path) 3 Hippo | 4.4 | Control 1 Occipital Ctx | 3.9
AD 1 Temporal Ctx | 15.7 | Control 2 Occipital Ctx | 64.6
AD 2 Temporal Ctx | 26.4 | Control 3 Occipital Ctx | 40.6
AD 3 Temporal Ctx | 12.3 | Control 4 Occipital Ctx | 6.4
AD 4 Temporal Ctx | 24.3 | Control (Path) 1 Occipital Ctx | 77.9
AD 5 Inf Temporal Ctx | 65.5 | Control (Path) 2 Occipital Ctx | 28.5
AD 5 Sup Temporal Ctx | 20.9 | Control (Path) 3 Occipital Ctx | 1.5
AD 6 Inf Temporal Ctx | 44.1 | Control (Path) 4 Occipital Ctx | 40.9
AD 6 Sup Temporal Ctx | 59.0 | Control 1 Parietal Ctx | 7.8
Control 1 Temporal Ctx | 9.5 | Control 2 Parietal Ctx | 34.4
Control 2 Temporal Ctx | 34.6 | Control 3 Parietal Ctx | 15.8
Control 3 Temporal Ctx | 0.0 | Control (Path) 1 Parietal Ctx | 68.8
Control 3 Temporal Ctx | 10.4 | Control (Path) 2 Parietal Ctx | 32.3
Control (Path) 1 Temporal Ctx | 68.8 | Control (Path) 3 Parietal Ctx | 4.9
Control (Path) 2 Temporal Ctx | 49.7 | Control (Path) 4 Parietal Ctx | 58.6
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Table AE. Panel 1.3D 



Tissue Name 


Kel. 

Exd %} 
Ag2100, 
Run 

152517508 


Tissue Name 


Kel. 

Exp.(%) 
Ag2100, 
Run 

152517508 


Liver adenocarcinoma 


11.7 


Kidney (fetal) 


1.8 


Pancreas 


0.0 


Renal ca. 786-0 


7.1 


Pancreatic ca. CAPAN 2 


3.2 


Renal ca. A498 


3.7 


Adrenal gland 


1.4 


Renal ca.RXF393 


3.1 


Thyroid 


0.1 


Renal ca. ACHN 


4.4 


Salivary gland 


0.1 


Renal ca.TJO-31 


6.3 


Pituitary gland 


2.1 


Renal ca. TK-10 


3.2 


Brain (fetal) 


2.1 


Liver 


0.0 


Brain (whole) 


24.7 


Liver (fetal) 


3.8 


Brain (amygdala) 


11.2 


Liver ca. (hepatoblast) HepG2 


3.2 j 


Brain (cerebellum) 


2.7 


Lung 


0.3 


Brain (hippocampus) 


36.3 


Lung (fetal) 


0.9 


Brain (substantia nigra) 


1.5 


Lung ca. (small cell) LX-1 


6.6 


Brain (thalamus) 


30.4 


Lung ca. (small cell) NCI-H69 


8.5 


Cerebral Cortex 


100.0 


Lung ca. (s.cell var.) SHP-77 


7.5 


Spinal cord 


2.5 


Lung ca. (large cell)NCI-H460 


0.0 


glio/astro U87-MG 


6.4 


Lung ca. (non-sm. cell) A549 


0.2 


glio/astro U-118-MG 


33.7 


Lung ca. (non-s.cell) NCI-H23 


10.4 


astrocytoma SW1783 


5.9 


Lung ca. (non-s.cell) HOP-62 


1.4 


neuro*; met SK-N-AS 


14.5 


Lung ca. (non-s.cl) NCI-H522 


5.3 


astrocytoma SF-539 


7.4 


Lung ca. (squam.) SW 900 


3.2 


astrocytoma SNB-75 


5.8 


Lung ca. (squam.) NCI-H596 


7.2 


glioma SNB-19 


1.0 


Mammary gland 


0.2 


glioma U251


2.4 


Breast ca * (pl.ef) MCF-7 


5.6 


glioma SF-295 


0.9 


Breast ca * (pl.ef) MDA-MB-231 


14.5 


Heart (fetal) 


0.4 


Breast ca.* (pl.ef) T47D 


2.4 


Heart 


0.1 


Breast ca. BT-549 


6.8 


Skeletal muscle (fetal) 


3.4 


Breast ca. MDA-N 


14.0 


Skeletal muscle 


0.1 


Ovary 


2.2 


Bone marrow 


5.4 


Ovarian ca.OVCAR-3 


2.5 


Thymus 


2.1 


Ovarian ca.OVCAR-4 


0.8 


Spleen 


0.6 


Ovarian ca. OVCAR-5 


2.7 


Lymph node 


0.4 


Ovarian ca. OVCAR-8 


3.2 


Colorectal 


1.8 


Ovarian ca.IGROV-1 


2.0 


Stomach 


1.0 


Ovarian ca.* (ascites) SK-OV-3 


7.4 


Small intestine 


1.6 


Uterus 


0.0 


Colon ca. SW480 


13.1 


Placenta 


0.2

Colon ca.* SW620(SW480 met)


4.5 



Prostate 


" ""It "H ""^ " 




Colon ca. HT29 


4.1 


Prostate ca.* (bone met)PC-3 


2.0 




Colon ca. HCT-116 


5.0 


Testis 


4.0 




Colon ca. CaCo-2 


5.9 


Melanoma Hs688(A).T 


0.7 




Colon ca. tissue(0DO3866) 


2.8 


Melanoma* (met) Hs688(B).T 


0.3 




Colon ca. HCC-2998 


3.7 


Melanoma UACC-62 


0.5 




Gastric ca * (liver met) NCI-N87 


2.3 


Melanoma M14 


7.2 




Bladder 


0.9 


Melanoma LOXIMVI 


2.8 




Trachea 


0.7 


Melanoma* (met) SK-MEL-5 


5.8 




Kidney 


0.7 


Adipose 


0.2 





Table AF. Panel 2.2 



Tissue Name

Rel. Exp.(%) Ag2100, Run 174166901

Tissue Name

Rel. Exp.(%) Ag2100, Run 174166901


Normal Colon 


6.3 


Kidney Margin (OD04348) 


30.4 


Colon cancer (OD06064) 


13.4 


Kidney malignant cancer 
(OD06204B) 


3.6 


Colon Margin (OD06064) 


9.0 


Kidney normal adjacent tissue 
(OD06204E) 


10.5 


Colon cancer (OD06159) 


4.5 


Kidney Cancer (OD04450-01) 


2.4 


Colon Margin (OD06159) 


5.9 


Kidney Margin (OD04450-03) 


13.3 


Colon cancer (OD06297-04) 


3.8 


Kidney Cancer 8120613 


6.7 


Colon Margin (OD06297-05) 


9.9 


Kidney Margin 8120614 


1.2 


CC Gr.2 ascend colon (OD03921) 


4.4 


Kidney Cancer 9010320 


1.7 


CC Margin (OD03921) 


2.8 


Kidney Margin 9010321 


4.5 


Colon cancer metastasis 
(OD06104) 


1.7 


Kidney Cancer 8120607 


0.5 


Lung Margin (OD06104) 


3.1 


Kidney Margin 8120608 


1.7 


Colon mets to lung (OD04451-01) 


9.6 


Normal Uterus 


1.1


Lung Margin (OD04451-02) 


3.2 


Uterine Cancer 064011


1.5 


Normal Prostate 


1.2 


Normal Thyroid 


0.0 


Prostate Cancer (OD04410) 


0.0 


Thyroid Cancer 064010 


0.6 


Prostate Margin (OD04410) 


0.7 


Thyroid Cancer A302152 


5.3 


Normal Ovary 


2.8 


Thyroid Margin A302153 


0.0 


Ovarian cancer (OD06283-03) 


11.7 


Normal Breast 


3.0 


Ovarian Margin (OD06283-07) 


3.0 


Breast Cancer (OD04566) 


8.1 


Ovarian Cancer 064008 


1.1 


Breast Cancer 1024 


2.9 


Ovarian cancer (OD06145) 


0.9 


Breast Cancer (OD04590-01) 


14.8 


Ovarian Margin (OD06145) 


0.0 


Breast Cancer Mets 
(OD04590-03) 


3.2



Ovarian cancer (OD06455-03) 


15.8 


Breast Cancer Metastasis
(OD04655-05) 


5.4 


Ovarian Margin (OD06455-07) 


1.8 


Breast Cancer 064006 




Normal Lung 


1.2 


Breast Cancer 9100266 


2.6 


Invasive poor diff. lung adeno
(ODO4945-01)


8.4 


Breast Margin 9100265 


2.3 


Lung Margin (ODO4945-03) 


1.2 


Breast Cancer A209073 


1.0


Lung Malignant Cancer 
(OD03126) 


5.0 


Breast Margin A2090734 


2.5 


Lung Margin (OD03126)


0.6 


Breast cancer (OD06083) 


17.1 




1 (\ o 


Breast cancer node metastasis 
(OD06083) 


14.7 


Lung Margin (OD05014B) 


9.0 


Normal Liver 


0.4 


Lung cancer (OD06081) 


10.1 


Liver Cancer 1026 


0.0 


Lung Margin (OD06081) 


4.0 


Liver Cancer 1025 


1.8 


Lung Cancer (OD04237-01) 


4.1 


Liver Cancer 6004-T 


1.1 


Lung Margin (OD04237-02) 


2.0 


Liver Tissue 6004-N


2.5


Ocular Melanoma Metastasis 


0.9 


Liver Cancer 6005-T 


1.6 


Ocular Melanoma Margin (Liver) 


0.4 


Liver Tissue 6005-N 


0.0 


Melanoma Metastasis 


10 d 


Liver Cancer 064003 


0.7 


Melanoma Margin (Lung) 


2.0 


Normal Bladder 


2.9 


Normal Kidney 


5.0 


Bladder Cancer 1023 


1.5 


Kidney Ca, Nuclear grade 2 
(OD04338) 


15.4 


Bladder Cancer A302173 


17.8 


Kidney Margin (OD04338) 


5.0 


Normal Stomach 


10.4 


Kidney Ca Nuclear grade 1/2
(OD04339) 


100.0 


Gastric Cancer 9060397 


1.1 


Kidney Margin (OD04339) 


9.3 


Stomach Margin 9060396


0.7 


Kidney Ca, Clear cell type 
(OD04340) 


14.0 


Gastric Cancer 9060395 


2.8 


Kidney Margin (OD04340)


11.3 


Stomach Margin 9060394 


2.8 


Kidney Ca, Nuclear grade 3 
(OD04348) 


9.0 


Gastric Cancer 064005 


6.0 


Table AG. Panel 3D 


Tissue Name

Rel. Exp.(%) Ag2100, Run 164796104

Tissue Name

Rel. Exp.(%) Ag2100, Run 164796104


Daoy- Medulloblastoma 


7.3 


Ca Ski- Cervical epidermoid 
carcinoma (metastasis) 


21.0 


TE671- Medulloblastoma 


3.8 


ES-2- Ovarian clear cell carcinoma


11.7 






D283 Med- Medulloblastoma 


15.7 


Ramos- Stimulated with
PMA/ionomycin 6h

10.8 


PFSK-1- Primitive 
Neuroectodermal 


11.2 


Ramos- Stimulated with
PMA/ionomycin 14h


6.2 


XF-498-CNS 


21.2 


MEG-01- Chronic myelogenous
leukemia (megakaryoblast)


5.8 


SNB-78- Glioma 


11.3 


Raji- Burkitt's lymphoma


6.7


SF-268- Glioblastoma 


7.6 


Daudi- Burkitt's lymphoma


14.8 


T98G- Glioblastoma 


12.0 


U266- B-cell plasmacytoma 


5.1 


SK-N-SH- Neuroblastoma
(metastasis) 


5.6 


CA46- Burkitt's lymphoma 


5.0 


SF-295- Glioblastoma 


12.4


RL- non-Hodgkin's B-cell
lymphoma


3.8 


Cerebellum 


16.2 


JM1- pre-B-cell lymphoma


11.5 


Cerebellum 


3.6 


Jurkat- T cell leukemia 


12.5 


NCI-H292- Mucoepidermoid 
lung carcinoma 


14.0 


TF-1- Erythroleukemia 


9.9 


DMS-114- Small cell lung 
cancer 


10.4 


HUT 78- T-cell lymphoma 


14.7


DMS-79- Small cell lung cancer


100.0 


U937- Histiocytic lymphoma 


8.1 


NCI-H146- Small cell lung 
cancer 


14.3 


KU-812- Myelogenous leukemia 


17.7


NCI-H526- Small cell lung 
cancer 


19.8 


769-P- Clear cell renal carcinoma 




NCI-N417- Small cell lung 
cancer 


5.8 


Caki-2- Clear cell renal carcinoma 


9.5


NCI-H82- Small cell lung cancer 


10.2 


SW 839- Clear cell renal carcinoma 


5.2 


NCI-H157- Squamous cell lung 
cancer (metastasis) 


13.8 


G401- Wilms' tumor 


6.3 


NCI-H1155- Large cell lung
cancer 


36.1 


Hs766T- Pancreatic carcinoma (LN 
metastasis) 


15.7 


NCI-H1299- Large cell lung
cancer 


22.7 


CAPAN-1- Pancreatic 
adenocarcinoma (liver metastasis) 


8.6 


NCI-H727- Lung carcinoid 


14.4 


SU86.86- Pancreatic carcinoma 
(liver metastasis) 


14.1 


NCI-UMC-11- Lung carcinoid


25.9 


BxPC-3- Pancreatic 
adenocarcinoma 


9.4 


LX-1- Small cell lung cancer


11.0 


HPAC- Pancreatic adenocarcinoma 


14.5 


Colo-205- Colon cancer


12.7 


MIA PaCa-2- Pancreatic carcinoma 


2.6 


KM 12- Colon cancer 


17.2 


CFPAC-1- Pancreatic ductal

adenocarcinoma 


38.7 


KM20L2- Colon cancer 


7.0



PANC-1- Pancreatic epithelioid 
ductal carcinoma 


19.5 


NCI-H716- Colon cancer 


19.5



T24- Bladder carcinoma (transitional
cell)


?.o 


SW-48- Colon adenocarcinoma 


10.6


5637- Bladder carcinoma 


10.5 


SW1116- Colon adenocarcinoma


7.7


HT-1197- Bladder carcinoma


1.8

LS 174T- Colon adenocarcinoma


9.8 


UM-UC-3- Bladder carcinoma
(transitional cell)


13.3 


SW-948- Colon adenocarcinoma 


1.4 


A204- Rhabdomyosarcoma




SW-480- Colon adenocarcinoma 


7.6


HT-1080- Fibrosarcoma


li.y 


NCI-SNU-5- Gastric carcinoma 


14.9 


MG-63- Osteosarcoma 


7.3 


KATO III- Gastric carcinoma


18.8 


SK-LMS-1- Leiomyosarcoma 
(vulva) 


48.0 


NCI-SNU-16- Gastric carcinoma 


12.6 


SJRH30- Rhabdomyosarcoma (met 
to bone marrow) 


10.2 


NCI-SNU-1- Gastric carcinoma


12.3 


A431- Epidermoid carcinoma 


12.2 


RF-1- Gastric adenocarcinoma 


5.3 


WM266-4- Melanoma 


21.9 


RF-48- Gastric adenocarcinoma 


7.6 


DU 145- Prostate carcinoma (brain 
metastasis) 


0.2 


MKN-45- Gastric carcinoma 


11.7 


MDA-MB-468- Breast 
adenocarcinoma 


5.6 


NCI-N87- Gastric carcinoma 


9.3 


SCC-4- Squamous cell carcinoma 
of tongue 


0.3


OVCAR-5- Ovarian carcinoma 


3.0 


SCC-9- Squamous cell carcinoma 
of tongue 


0.3 


RL95-2- Uterine carcinoma 


4.5 


SCC- 15- Squamous cell carcinoma 
of tongue 


0.2 


HelaS3- Cervical 
adenocarcinoma 


9.0 


CAL 27- Squamous cell carcinoma 
of tongue 


19.9 



Table AH. Panel 4D

Tissue Name

Rel. Exp.(%) Ag2100, Run 152800279

Tissue Name

Rel. Exp.(%) Ag2100, Run 152800279


Secondary Th1 act


15.4 


HUVEC IL-1beta


12.2 


Secondary Th2 act 


11.9 


HUVEC IFN gamma


16.6 


Secondary Tr1 act


15.6 


HUVEC TNF alpha + IFN gamma


11.8


Secondary Th1 rest


4.9 


HUVEC TNF alpha + IL4 


11.4 


Secondary Th2 rest 


3.3 


HUVEC IL-11 


8.2 


Secondary Tr1 rest


6.0 


Lung Microvascular EC none 


7.3 


Primary Th1 act


13.6 


Lung Microvascular EC TNFalpha 
+ IL-1beta


6.3 


Primary Th2 act 


12.0 


Microvascular Dermal EC none 


23.3 


Primary Tr1 act


22.2 


Microvascular Dermal EC
TNFalpha + IL-1beta


10.5 


Primary Th1 rest


100.0 


Bronchial epithelium TNFalpha + 
IL-1beta


0.6 


Primary Th2 rest 


37.9 


Small airway epithelium none 


1.6

Primary Tr1 rest


29.3 


Small airway epithelium TNFalpha
+ IL-1beta


7.4 


CD45RA CD4 lymphocyte act




Coronary artery SMC rest


*+.*+ 


CD45RO CD4 lymphocyte act 


15.4 


Coronary artery SMC TNFalpha +


2.0 


CD8 lymphocyte act 


10.6 


Astrocytes rest 


1.3


Secondary CD8 lymphocyte rest 


7.9 


Astrocytes TNFalpha + IL-1beta


0.5 


Secondary CD8 lymphocyte act 


17.3 


KU-812 (Basophil) rest 


22.4 


CD4 lymphocyte none 


0.5 


KU-812 (Basophil) 
PMA/ionomycin


28.5 


2ry Th1/Th2/Tr1_anti-CD95
CH11 


17.1 


CCD1106 (Keratinocytes) none 


14.3 


LAK cells rest 


3.6 


CCD1106 (Keratinocytes) 
TNFalpha + IL-lbeta 


18.4 


LAK cells IL-2 


16.8 


Liver cirrhosis 


0.5 


LAK cells IL-2+IL-12


8.4 


Lupus kidney 


3.3 


LAK cells IL-2+IFN gamma


16.4 


NCI-H292 none


29.5 


LAK cells IL-2+ IL-18


16.8 


NCI-H292 IL-4


27.7 


LAK cells PMA/ionomycin


0.6 


NCI-H292 IL-9 


32.3 


NK Cells IL-2 rest


15.3 


NCI-H292 IL-13


13.4 


Two Way MLR 3 day


1.8 


NCI-H292 IFN gamma


11.0 


Two Way MLR 5 day


6.1 


HPAEC none 


8.5 


Two Way MLR 7 day


10.1 


HPAEC TNF alpha + IL-1 beta


7.7 


PBMC rest 


0.1 


Lung fibroblast none 


6.3 


PBMC PWM


25.5 


Lung fibroblast TNF alpha + IL-1 
beta 


9.0 


PBMC PHA-L 


24.0 


Lung fibroblast IL-4


3.7 


Ramos (B cell) none 


17.7 


Lung fibroblast IL-9


J.U 


Ramos (B cell) ionomycin 


OO A 


Lung fibroblast IL-13


1.7


B lymphocytes PWM


4o.O 


Lung fibroblast IFN gamma




B lymphocytes CD40L and IL-4


10.4 


Dermal fibroblast CCD1070 rest




EOL-1 dbcAMP 


10.5


Dermal fibroblast CCD1070 TNF
alpha 


79.0 


EOL-1 dbcAMP 
PMA/ionomycin 


7.0 


Dermal fibroblast CCD1070 IL-1 
beta 


21.8 


Dendritic cells none 


0.5 


Dermal fibroblast IFN gamma 


22.2 


Dendritic cells LPS 


0.0 


Dermal fibroblast IL-4 


45.7 


Dendritic cells anti-CD40 


0.0 


IBD Colitis 2 


0.9 


Monocytes rest 


0.2 


IBD Crohn's 


1.0 


Monocytes LPS 


0.0 


Colon 


3.7 


Macrophages rest 


4.4 


Lung 


1.5 


Macrophages LPS 


0.6 


Thymus 


13.0 


HUVEC none 


24.7 


Kidney 


31.2 


HUVEC starved 


43.5

Table AI. Panel CNS 1

Tissue Name

Rel. Exp.(%) Ag2100, Run 171649357

Tissue Name

Rel. Exp.(%) Ag2100, Run 171649357


BA4 Control 


23.8 


BA17 PSP 


35.4 


BA4 Control2 


19.1 


BA17 PSP2 


18.3 


BA4 Alzheimer's2


7.3 


Sub Nigra Control 


11.6 


BA4 Parkinson's


43.8 


Sub Nigra Control2 


5.0 


BA4 Parkinson's2


60.7 


Sub Nigra Alzheimer^ 


4.6 


BA4 Huntington's 


23.3


Sub Nigra Parkinson's2 


11.8 


BA4 Huntington's2 


14.7 


Sub Nigra Huntington's 


16.0 


BA4 PSP


13.8 


Sub Nigra Huntington's2 


8.8 


BA4 PSP2


26.2 


Sub Nigra PSP2 


1.7 


BA4 Depression 


15.4 


Sub Nigra Depression 


in 


BA4 Depression2 


17.0 


Sub Nigra Depression2 


8.0 


BA7 Control 


36.6 


Glob Palladus Control


8.4 


BA7 Control2 


17.4 


Glob Palladus Control2


10.8 


BA7 Alzheimer^ 


11.3 


Glob Palladus Alzheimer's 


1.8 


BA7 Parkinson's 


21.9 


Glob Palladus Alzheimer's2 


8.3 


BA7 Parkinson's2 


36.1 


Glob Palladus Parkinson's 


51.1 


BA7 Huntington's 


56.3 


Glob Palladus Parkinson's2 


12.9 


BA7 Huntington's2 


45.1 


Glob Palladus PSP 


9.3 


BA7 PSP


44.4 


Glob Palladus PSP2


9.9 


BA7 PSP2


17.6 


Glob Palladus Depression 


6.0 


BA7 Depression 


8.5 


Temp Pole Control 


9.8 


BA9 Control 


31.9 


Temp Pole Control2 


21.5 


BA9 Control2 


34.4 


Temp Pole Alzheimer's 


6.6


BA9 Alzheimer's 


8.0 


Temp Pole Alzheimer's2


8.1 


BA9 Alzheimer's2 


20.0 


Temp Pole Parkinson's 


33.0 


BA9 Parkinson's 


40.6 


Temp Pole Parkinson's2 


24.8 


BA9 Parkinson's2 


31.4 


Temp Pole Huntington's 


33.2 


BA9 Huntington's 


41.5 


Temp Pole PSP 


8.8 


BA9 Huntington's2 


21.8 


Temp Pole PSP2 


6.0 


BA9 PSP


17.8 


Temp Pole Depression2 


17.0 


BA9 PSP2


8.2 


Cing Gyr Control 


23.3


BA9 Depression


10.5 


Cing Gyr Control2 


17.8 


BA9 Depression2 


16.2 


Cing Gyr Alzheimer's 


7.3 


BA17 Control 


58.2 


Cing Gyr Alzheimer's2 


10.4 


BA17 Control 


41.8 


Cing Gyr Parkinson's 


13.4 


BA17 Alzheimer's2 


27.0 


Cing Gyr Parkinson's2 


17.0 


BA17 Parkinson's 


58.6 


Cing Gyr Huntington's 


28.3

BA17 Parkinson's2


69.3 


Cing Gyr Huntington's2


.^L J j? 'J! it 


BA17 Huntington's 


44.4 


Cing Gyr PSP 


7.2 


BA17 Huntington's2


31.9 


Cing Gyr PSP2


4.0 


BA17 Depression


13.6 


Cing Gyr Depression 


6.9 


BA17 Depression2 


100.0 


Cing Gyr Depression2 


10.4 



AI.05 chondrosarcoma Summary: Ag2100 Highest expression of this gene is detected in
the untreated, serum-starved chondrosarcoma cell line SW1353 (CT=27). Interestingly,
expression of this gene appears to be somewhat down-regulated upon treatment with IL-1,
a potent activator of pro-inflammatory cytokines and matrix metalloproteinases, which
participate in the destruction of cartilage observed in osteoarthritis (OA). Modulation
of the expression of this transcript in chondrocytes by either small molecules or
antisense might be important for preventing the degeneration of cartilage observed in OA.

AI_comprehensive panel_v1.0 Summary: Ag2100 Highest expression of this gene is
detected in osteoarthritis (OA) bone (CTs=27-28). This gene is highly expressed in bone
isolated from 5 different OA patients and in synovium from 3 out of 5 OA patients, but
not in cartilage from OA patients nor in any tissues from rheumatoid arthritis (RA)
patients or control samples. Thus, small molecule therapeutics designed against the
protein encoded by this gene could reduce or inhibit inflammation. Antisense
therapeutics that block translation of the transcript, and thus protein production,
could also inhibit inflammatory processes. These types of therapeutics could be
important in the treatment of diseases such as osteoarthritis.

CNS_neurodegeneration_v1.0 Summary: Ag2100 This panel confirms the expression of
this gene at low levels in the brains of an independent group of individuals. However,
no differential expression of this gene was detected between postmortem brains from
Alzheimer's disease patients and those of non-demented controls in this experiment.
Please see Panel 1.3D for a discussion of this gene in the treatment of central nervous
system disorders.

Panel 1.3D Summary: Ag2100 Expression of this gene is highest in cerebral cortex
(CT = 26.3). This gene is expressed at moderate levels in all the regions of the CNS
examined, including amygdala, cerebellum, hippocampus, substantia nigra, thalamus,
spinal cord, and fetal brain. This gene encodes a protein with homology to
citron-kinase. Citron-kinase (Citron-K) has been proposed by in vitro studies to be a
crucial effector of Rho in the regulation of cytokinesis. Citron-K is essential for
cytokinesis in vivo in specific neuronal precursors and may play a fundamental role in
specific human malformative syndromes of the CNS (Di Cunto et al., 2000, Neuron
28:115-127, PMID: 11086988). General inhibitors of the RHO/RAC-INTERACTING CITRON KINASE
family disrupt endothelial tight junctions, suggesting that specific modulators of this
brain-preferential family member could be useful in the delivery of therapeutics across
the blood brain barrier. These general inhibitors also influence intracellular calcium
flux, which is a central component of many important neuronal processes, such as
apoptosis, neurotransmitter release and signal transduction (Jezior et al., 2001, Br. J.
Pharmacol. 134:78-87, PMID: 11522599; Walsh et al., 2001, Gastroenterology 121:566-579,
PMID: 11522741). Thus, modulators of the function of the protein encoded by this gene
may prove useful in the treatment of neurodegenerative disorders involving apoptosis,
such as spinal muscular atrophy, Alzheimer's disease, Huntington's disease, Parkinson's
disease, and others. Diseases involving neurotransmitters or signal transduction, such
as schizophrenia, mania, stroke, epilepsy and depression, may also benefit from agents
that modulate the function of this gene product.

This gene also shows moderate to low expression in several metabolic tissues,
including adrenal gland, pituitary gland, gastrointestinal tract, fetal heart, fetal
skeletal muscle and fetal liver. Therefore, therapeutic modulation of the activity of
this gene may prove useful in the treatment of endocrine and metabolically related
diseases, such as obesity and diabetes.

Interestingly, expression of this gene is higher in fetal liver and skeletal muscle
(CTs=31) than in the corresponding adult tissues (CTs=37-40). This observation suggests
that expression of this gene can be used to distinguish fetal from adult liver and
skeletal muscle. In addition, the relative over-expression of this gene in fetal tissue
suggests that the protein product may enhance liver and muscle growth or development in
the fetus and thus may also act in a regenerative capacity in the adult. Therefore,
therapeutic modulation of the protein encoded by this gene could be useful in the
treatment of liver and skeletal muscle related diseases.
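
The CT gap quoted for fetal versus adult tissue can be turned into an approximate fold
difference. The sketch below is illustrative only: it assumes ideal amplification, in
which the amount of product doubles every cycle, an assumption that is not stated in
this section. The function name `fold_difference` and its `efficiency` parameter are
hypothetical.

```python
def fold_difference(ct_weaker: float, ct_stronger: float, efficiency: float = 1.0) -> float:
    """Approximate fold difference in starting template between two samples.

    ct_weaker   -- CT of the sample with lower expression (the larger CT value)
    ct_stronger -- CT of the sample with higher expression (the smaller CT value)
    efficiency  -- assumed per-cycle amplification efficiency; 1.0 means perfect doubling
    """
    return (1.0 + efficiency) ** (ct_weaker - ct_stronger)


# CT values quoted in the Panel 1.3D summary: fetal tissue about 31, adult tissue 37-40.
fetal_ct = 31.0
adult_cts = (37.0, 40.0)

for adult_ct in adult_cts:
    fold = fold_difference(adult_ct, fetal_ct)
    print(f"adult CT {adult_ct:.0f} vs fetal CT {fetal_ct:.0f}: "
          f"about {fold:.0f}-fold more template in fetal tissue")
```

Under the perfect-doubling assumption, a gap of 6 to 9 cycles corresponds to roughly a
64- to 512-fold difference; a lower real efficiency would shrink these figures.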

Moderate expression of this gene is also seen in a cluster of cancer cell lines
derived from pancreatic, gastric, colon, lung, liver, renal, breast, ovarian, prostate,
melanoma and brain cancers. Thus, therapeutic modulation of the expression or function of
this gene may be effective in the treatment of pancreatic, gastric, colon, lung, liver,
renal, breast, ovarian, prostate, melanoma and brain cancers.

Panel 2.2 Summary: Ag2100 Expression of this gene is highest in a kidney cancer
sample (CT=28). In addition, significant expression of this gene is also seen in a number
of normal and cancer tissues including colon, lung, ovary, breast, kidney, thyroid, liver,
bladder, and stomach. Interestingly, this gene is expressed at slightly higher levels in
most of the tumors than in the matched normal tissue. Thus, expression of this gene could
be used to distinguish between cancerous tissue and normal tissue. In addition,
therapeutic modulation of this gene product, through the use of small molecule drugs or
antibodies, might be of benefit in the treatment of cancer.
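
The tumor-versus-margin comparison can be checked against the matched pairs listed in
Table AF. The sketch below copies a subset of those pairs (the ones whose sample
identifiers match unambiguously) and counts how many show higher relative expression in
the tumor. The names `PAIRS` and `tumor_to_margin_ratio`, and the 0.1 floor used to
avoid dividing by zero, are illustrative assumptions rather than part of the described
method.

```python
# (tumor, matched margin) Rel. Exp.(%) values copied from Table AF.
# This is a subset of the panel, chosen because the sample IDs pair up unambiguously.
PAIRS = {
    "Colon OD06064": (13.4, 9.0),
    "Colon OD06159": (4.5, 5.9),
    "Colon OD06297": (3.8, 9.9),
    "Kidney OD04338": (15.4, 5.0),
    "Kidney OD04339": (100.0, 9.3),
    "Kidney OD04340": (14.0, 11.3),
    "Kidney OD04348": (9.0, 30.4),
    "Ovary OD06283": (11.7, 3.0),
    "Ovary OD06455": (15.8, 1.8),
    "Lung OD06081": (10.1, 4.0),
    "Lung OD04237": (4.1, 2.0),
}


def tumor_to_margin_ratio(tumor: float, margin: float, floor: float = 0.1) -> float:
    """Ratio of tumor to margin expression; `floor` guards against margin values of 0.0."""
    return tumor / max(margin, floor)


higher_in_tumor = [name for name, (tumor, margin) in PAIRS.items() if tumor > margin]

for name, (tumor, margin) in PAIRS.items():
    print(f"{name:15s} tumor={tumor:6.1f} margin={margin:5.1f} "
          f"ratio={tumor_to_margin_ratio(tumor, margin):5.1f}")
print(f"{len(higher_in_tumor)} of {len(PAIRS)} pairs are higher in tumor")
```

Because each panel is normalized to its highest sample, these ratios compare samples
within a single run only and say nothing about absolute transcript abundance.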

Panel 3D Summary: Ag2100 Expression of this gene is highest in a lung cancer cell
line (CT = 26). However, low to moderate expression is also seen in the majority of
cancer cell lines on this panel, suggesting that this gene may play an important role in
many cell types.

Panel 4D Summary: Ag2100 Highest expression of this gene is detected in resting
primary Th1 cells (CT=24.5). Moderate to low levels of expression of this gene are seen
in members of the T-cell, B-cell, endothelial cell, macrophage/monocyte, and peripheral
blood mononuclear cell family, as well as epithelial and fibroblast cell types from lung
and skin, and normal tissues represented by colon, lung, thymus and kidney.
Interestingly, this gene is highly induced in Ramos B cells treated with PMA and
ionomycin, and in non-transformed B cells and PBMC treated with PWM. All three of these
observations are consistent with this gene being induced in B cells after activation.
This gene product has homology to the RHO/RAC-interacting citron kinase. Thus, the
citron kinase encoded by this gene may play an important role in T cell activation, by
regulating TCR-mediated T cell spreading, chemotaxis and other chemokine responses, and
in apoptosis. Likewise, this putative kinase may also be important in B cell motility,
antigen receptor mediated activation and apoptosis.
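
The induction described for PBMC can be expressed as a ratio of the stimulated to the
resting values listed in Table AH. The following is a minimal sketch, assuming that
values within one run are directly comparable and that a resting value of 0.0 should be
floored at the table's reporting resolution of 0.1; the helper `fold_induction` is
hypothetical.

```python
def fold_induction(stimulated: float, resting: float, floor: float = 0.1) -> float:
    """Ratio of stimulated to resting Rel. Exp.(%); `floor` avoids division by zero."""
    return stimulated / max(resting, floor)


# PBMC rows of Table AH (Rel. Exp.(%), Ag2100).
PBMC = {"rest": 0.1, "PWM": 25.5, "PHA-L": 24.0}

for treatment in ("PWM", "PHA-L"):
    ratio = fold_induction(PBMC[treatment], PBMC["rest"])
    print(f"PBMC {treatment}: about {ratio:.0f}-fold over resting PBMC")
```

Because the resting value sits at the bottom of the reported range, the computed ratio
is best read as a lower bound on the induction rather than a precise measurement.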

Small molecule therapeutics designed against the protein encoded by this gene could
reduce or inhibit inflammation. Antisense therapeutics that block translation of the
transcript, and thus protein production, could also inhibit inflammatory processes.
These types of therapeutics could be important in the treatment of diseases such as
osteoarthritis. Likewise, these therapeutics could be important in the treatment of
asthma, psoriasis, diabetes, and IBD, which require activated T cells, as well as
diseases that involve B cell activation, such as systemic lupus erythematosus.

Panel CNS_1 Summary: Ag2100 This panel confirms the expression of this gene at low
levels in the brains of an independent group of individuals. Please see Panel 1.3D for
a discussion of this gene in the treatment of central nervous system disorders.

B. CG117662-02: Renal renin precursor like.

Expression of gene CG117662-02 was assessed using the primer-probe sets Ag2078 and
Ag5185, described in Tables BA and BB. Results of the RTQ-PCR runs are shown in
Tables BC, BD, BE, BF and BG.

Table BA. Probe Name Ag2078

Primers   Sequence                                     Length   Start Position   SEQ ID No
Forward   5'-accttcaaagtcgtctttgaca-3'                 22       292              252
Probe     TET-5'-ctccaagtgcagccgtctctacactg-3'-TAMRA   26       342              253
Reverse   5'-cgaagagcttgtgatacacaca-3'                 22       370              254



Table BB. Probe Name Ag5185

Primers   Sequence                                     Length   Start Position   SEQ ID No
Forward   5'-ccgtgtctgtggggtcat-3'                     18       491              255
Probe     TET-5'-attggtagacaccggtgcatcctaca-3'-TAMRA   26       540              256
Reverse   5'-tggagctggtagaacctgaga-3'                  21       566              257
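
The primer-probe sets in Tables BA and BB can be sanity-checked by recomputing each
oligonucleotide's length and the span between the forward and reverse primers. The
sketch below is hedged: it assumes that every start position marks the lowest template
coordinate covered by the oligonucleotide, a convention the tables do not state, and
the `Oligo` class and `amplicon_length` helper are illustrative names only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Oligo:
    role: str      # "Forward", "Probe" or "Reverse"
    sequence: str  # 5'->3' sequence as printed in the table (labels such as TET/TAMRA omitted)
    start: int     # start position as printed in the table

    @property
    def length(self) -> int:
        return len(self.sequence)

    @property
    def gc_fraction(self) -> float:
        # Fraction of bases that are G or C.
        return sum(base in "gc" for base in self.sequence.lower()) / self.length


def amplicon_length(forward: Oligo, reverse: Oligo) -> int:
    # Assumption: each start position is the lowest template coordinate of the oligo,
    # so the amplicon runs from forward.start to the end of the reverse primer site.
    return reverse.start + reverse.length - forward.start


AG2078 = [
    Oligo("Forward", "accttcaaagtcgtctttgaca", 292),
    Oligo("Probe", "ctccaagtgcagccgtctctacactg", 342),
    Oligo("Reverse", "cgaagagcttgtgatacacaca", 370),
]
AG5185 = [
    Oligo("Forward", "ccgtgtctgtggggtcat", 491),
    Oligo("Probe", "attggtagacaccggtgcatcctaca", 540),
    Oligo("Reverse", "tggagctggtagaacctgaga", 566),
]

for name, oligos in (("Ag2078", AG2078), ("Ag5185", AG5185)):
    for oligo in oligos:
        print(f"{name} {oligo.role:8s} length={oligo.length} GC={oligo.gc_fraction:.2f}")
    print(f"{name} amplicon span: {amplicon_length(oligos[0], oligos[2])} bases")
```

The recomputed lengths match the Length column of both tables, which is a useful check
on the sequences recovered from the scanned text.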



Table BC. CNS neurodegeneration v1.0

Tissue Name

Rel. Exp.(%) Ag5185, Run 226559655

Tissue Name

Rel. Exp.(%) Ag5185, Run 226559655


AD 1 Hippo 


5.7 


Control (Path) 3 Temporal Ctx 


48.6 


AD 2 Hippo 


82.4 


Control (Path) 4 Temporal Ctx 


54.3


AD 3 Hippo 


11.4 


AD 1 Occipital Ctx 


12.2 


AD 4 Hippo 


50.0 


AD 2 Occipital Ctx (Missing) 


0.0 


AD 5 Hippo 


22.5 


AD 3 Occipital Ctx 


18.8

AD 6 Hippo



15.2 



AD 4 Occipital Ctx



37.3



Control 2 Hippo 



9.6 



AD 5 Occipital Ctx 



12.1 



Control 4 Hippo 



18.3 



AD 6 Occipital Ctx 



25.0 



Control (Path) 3 Hippo 



85.3 



Control 1 Occipital Ctx 



26.2 



AD 1 Temporal Ctx 



38.4 



Control 2 Occipital Ctx 



3.6 



AD 2 Temporal Ctx 



74.7 



Control 3 Occipital Ctx 



40.6 



AD 3 Temporal Ctx 



0.0 



Control 4 Occipital Ctx 



20.9 



AD 4 Temporal Ctx 



49.0 



Control (Path) 1 Occipital Ctx 



39.2 



AD 5 Inf Temporal Ctx 



31.6 



Control (Path) 2 Occipital Ctx 



18.3 



AD 5 Sup Temporal Ctx 



36.3 



Control (Path) 3 Occipital Ctx 



0.0 



AD 6 Inf Temporal Ctx 



55.5 



Control (Path) 4 Occipital Ctx 



0.0 



AD 6 Sup Temporal Ctx 



63.3 



Control 1 Parietal Ctx 



46.7 



Control 1 Temporal Ctx 



100.0 



Control 2 Parietal Ctx 



0.0 



Control 2 Temporal Ctx 



40.6 



Control 3 Parietal Ctx 



12.2 



Control 3 Temporal Ctx 



47.0 



Control (Path) 1 Parietal Ctx 



65.5 



Control 3 Temporal Ctx 



24.7 



Control (Path) 2 Parietal Ctx 



23.8 



Control (Path) 1 Temporal Ctx 



50.7 



Control (Path) 3 Parietal Ctx 



0.0 



Control (Path) 2 Temporal Ctx 



65.5 



Control (Path) 4 Parietal Ctx 



57.4 



Table BD. General screening panel v1.5

Tissue Name

Rel. Exp.(%) Ag5185, Run 228757766

Tissue Name

Rel. Exp.(%) Ag5185, Run 228757766


Adipose 


1.0


Renal ca. TK-10 


0.0 


Melanoma* Hs688(A).T 


0.2 


Bladder 


0.5 


Melanoma* Hs688(B).T 


0.1 


Gastric ca. (liver met.) NCI-N87 


1.1 


Melanoma* M14 


0.1 


Gastric ca. KATO III


0.3 


Melanoma* LOXIMVI 


0.1 


Colon ca. SW-948 


18.2 


Melanoma* SK-MEL-5 


0.2 


Colon ca. SW480 


0.6 


Squamous cell carcinoma SCC-4 


0.4 


Colon ca.* (SW480 met) SW620 


0.5 


Testis Pool 


8.4 


Colon ca. HT29 


1.6 


Prostate ca * (bone met) PC-3 


1.5 


Colon ca.HCT-116 


0.5 


Prostate Pool 


0.6 


Colon ca. CaCo-2 


0.2 


Placenta 


3.0 


Colon cancer tissue 


2.6 


Uterus Pool 


1.5 


Colon ca. SW1116 


0.1 


Ovarian ca. OVCAR-3 


0.9 


Colon ca. Colo-205 


0.0


Ovarian ca. SK-OV-3 


0.2 


Colon ca. SW-48 


0.8 


Ovarian ca. OVCAR-4 


0.7 


Colon Pool 


4.7 


Ovarian ca. OVCAR-5 


4.7 


Small Intestine Pool 


4.0 


Ovarian ca. IGROV-1 


0.0 


Stomach Pool 


2.3

Ovarian ca. OVCAR-8



0.2 



g ± 3 



Ovary 



6.7



Fetal Heart 



0.2 



Breast ca. MCF-7 



0.5 



Heart Pool 



1.6 



Breast ca. MDA-MB-231



0.6 



Lymph Node Pool 



12.6 



Breast ca. BT 549 



0.2 



Fetal Skeletal Muscle 



0.3 



Breast ca. T47D 



2.1 



Skeletal Muscle Pool 



0.4 



Breast ca. MDA-N 



0.0 



Spleen Pool 



0.1 



Breast Pool 



5.0 



Thymus Pool 



3.8 



Trachea 



1.0 



CNS cancer (glio/astro) U87-MG 



0.0 



Lung 



22.1 



CNS cancer (glio/astro) U-118-MG 



0.1 



Fetal Lung 



0.6 



CNS cancer (neuro;met) SK-N-AS 



0.0 



Lung ca. NCI-N417



0.4 



CNS cancer (astro) SF-539 



0.0 



Lung ca. LX-1 
Lung ca. NCI-H146



0.3 
0.0 



CNS cancer (astro) SNB-75 



0.0 



CNS cancer (glio) SNB-19



0.3 



Lung ca. SHP-77



0.1 



CNS cancer (glio) SF-295 



0.5 



Lung ca. A549 



0.0 



Brain (Amygdala) Pool 



0.4 



Lung ca. NCI-H526



0.5 



Brain (cerebellum) 



0.5 



Lung ca. NCI-H23 



Lung ca. NCI-H460 



1.4 



Brain (fetal) 



0.0 



2.0 



Brain (Hippocampus) Pool 



0.2 



Lung ca. HOP-62 



0.1 



Cerebral Cortex Pool 



0.3 



Lung ca. NCI-H522



0.6 



Brain (Substantia nigra) Pool 



0.3 



Liver 



1.0 



Brain (Thalamus) Pool 



0.6 



Fetal Liver 



1.0 



Brain (whole) 



0.8 



Liver ca. HepG2 



0.0 



Spinal Cord Pool 



0.5 



Kidney Pool 



4.2 



Adrenal Gland 



2.6 



Fetal Kidney 



100.0 



Pituitary gland Pool 



0.6 



Renal ca. 786-0 



0.0 



Salivary Gland 



0.5 



Renal ca. A498 



0.0 



Thyroid (female) 



0.1 



Renal ca. ACHN 



0.2 



Pancreatic ca. CAPAN2 



0.2 



Renal ca. UO-31 



0.3 



Pancreas Pool 



4.9 



Table BE. Panel 1.3D

Tissue Name

Rel. Exp.(%) Ag2078, Run 165626684

Rel. Exp.(%) Ag2078, Run 165627496

Rel. Exp.(%) Ag2078, Run 165678122

Tissue Name

Rel. Exp.(%) Ag2078, Run 165626684

Rel. Exp.(%) Ag2078, Run 165627496

Rel. Exp.(%) Ag2078, Run 165678122


Liver 

adenocarcinoma 


0.0 


0.1 


0.1 


Kidney (fetal) 


100.0 


100.0 


100.0 


Pancreas 


0.0 


0.0 


0.0 


Renal ca. 786-0 


0.0 


0.0 


0.0 


Pancreatic ca. 
CAPAN 2


0.0 


0.0 


0.2 


Renal ca. A498 


0.0 


0.0 


0.1

Adrenal gland


0.5 


0.5


0.3 


Renal ca. RXF
393


0.0


0.0


0.0 


Thyroid 


0.0 


0.0


0.0 


Renal ca. 
ACHN 


0.0 


0.0 


0.0 


Salivary gland


0.0 


0.1 


0.0 


Renal ca. 
UO-31 


0.0 


0.0 


0.1 


Pituitary gland


0.0 


0.2 


0.0 


Renal ca. 
TK-10 


0.0 


0.1 


0.0 


Brain (fetal) 


0.0 


0.0 


0.0 


Liver 


0.3 


0.3 


0.0 


Brain (whole) 


0.0 


0.0 


0.1 


Liver (fetal) 


0.6 


0.7 


0.5 


Brain 

(amygdala) 


0.1 


0.0 


0.0 


Liver ca. 

(hepatoblast) 

HepG2 


0.0


0.0 


0.0 


Brain 

(cerebellum) 


0.1 


0.0 


0.1 


Lung 


0.0 


0.0 


0.1 


Brain 

(hippocampus) 


0.0 


0.3 


0.0 


Lung (fetal) 


0.1 


0.1 


0.0 


Brain 

(substantia 

nigra) 


0.0 


0.0 


0.1 


Lung ca. (small 
cell) LX-1


0.0 


0.0 


0.0 


Brain 
(thalamus) 


0.1 


0.0 


0.1 


Lung ca. (small 
cell)NCI-H69 


0.0 


0.0 


0.0 


Cerebral Cortex 


0.0 


0.0 


0.2 


Lung ca. (s.cell 
var.) SHP-77


0.0 


0.1 


0.0 


Spinal cord 


0.0 


0.0 


0.0 


Lung ca. (large 
cell) NCI-H460


0.0 


0.0 


0.0 


glio/astro 
U87-MG 


0.0 


0.0 


0.0 


Lung ca.

(non-sm. cell) 
A549 


0.0 


0.0 


0.0 


glio/astro 
U-118-MG 


0.0 


0.0 


0.0 


Lung ca.

(non-s.cell) 
NCI-H23 


0.0 


0.0 


0.0 


astrocytoma 
SW1783 


0.0 


0.0 


0.1 


Lung ca.

(non-s.cell) 

HOP-62 


0.1 


0.0 


0.0 


neuro*; met 
SK-N-AS 


0.0 


0.1 


0.0 


Lung ca.

(non-s.cl) 

NCI-H522 


0.0 


0.0 


0.0 


astrocytoma 


0.0 


0.0 


0.0 


Lung ca. 
(squam.) SW 
900 


0.1 


0.1 


0.0 


astrocytoma 
SNB-75 


0.0 


0.0 


0.2 


Lung ca. 
(squam.) 
NCI-H596 


0.0 


0.0 


0.0 


glioma SNB-19 


0.0 


0.0 


0.0 


Mammary 
gland 


0.2 


0.2 


0.1 


glioma U251 


0.0 


0.0 


0.0 


Breast ca.* 
(pl.ef)MCF-7 


0.0 


0.0 


0.1

glioma SF-295



Heart (fetal) 



0.0 



0.0 



0.0 



0.0 



0.0 



0.0 



Breast ca.* 
(pl.ef) 

MDA-MB-231 



Breast ca.* 
(pl.ef)T47D 



0.1 



0.1 



0.0 



0.0 



0.0 



0.0 



Heart 



0.0 



Skeletal muscle 
(fetal) 



0.0 



0.0 



0.0 



0.0 



0.0 



Breast ca. 
BT-549 

Breast ca. 
MDA-N 



0.0 



0.0 



0.0 



0.0 



0.0 



0.0 



Skeletal muscle 



0.0 



0.0 



0.0 



Ovary 



0.6 



0.8 



0.6 



Bone marrow 



0.0 



0.0 



0.0 



Ovarian ca. 
OVCAR-3 



0.1 



0.1 



0.0 



Thymus 



0.0 



0.0 



0.0 



Ovarian ca. 
OVCAR-4 



0.0 



0.1 



0.0 



Spleen 



0.0 



0.0 



0.0 



Ovarian ca. 
OVCAR-5 



0.2 



0.2 



0.1 



Lymph node 



0.0 



0.1 



0.0 



Ovarian ca. 
OVCAR-8 



0.0 



0.0 



0.0 



Colorectal 



0.0 



0.0 



0.0 



Ovarian ca. 
IGROV-1 



0.0 



0.0 



0.0 



Stomach 



0.0 



Small intestine 



0.1 



0.0 



0.1 



Ovarian ca.* 

(ascites) 

SK-OV-3 



0.0 



0.0 



0.0 



Uterus 



1.7 



0.0 



1.1 



0.0 



1.1 



Colon ca. 
SW480 



0.0 



0.0 



0.0 



Placenta 



0.7 



1.2 



0.7 



Colon ca.* 
SW620(SW480 
met) 



0.0 



0.0 



0.0 



Prostate 



0.1 



0.0 



0.1 



Colon ca. HT29 



0.2 



0.3 



0.3 



Prostate ca.* 
(bone met)PC-3 



0.2 



0.2 



0.0 



Colon ca. 
HCT-116 



0.0 



0.0 



0.0 



Testis 



0.2 



0.1 



0.2 



Colon ca. 
CaCo-2 



0.0 



0.0 



0.0 



Melanoma 
Hs688(A).T 



0.0 



0.0 



0.0 



Colon ca. 
tissue (OD03866)



0.2 



Colon ca. 
HCC-2998 



0.1 



0.1 



0.3 



0.5 
0.1 



Melanoma* 

(met) 

Hs688(B).T 



0.0 



Melanoma 
UACC-62 



0.0 



0.0 
0.0 



0.0 
0.0 



Gastric ca.* 
(liver met) 
NCI-N87 



0.0 



0.1 



0.0 



Melanoma M14 



0.0 



0.0 



0.0 



Bladder 



Trachea 



Kidney 



0.0 



0.1 



11.2 



0.0 



0.0 



Melanoma 
LOXIMVI 



0.0 



0.0 



10.8 



0.0 



8.7 



Melanoma* 
(met) 

SK-MEL-5 
Adipose 



0.0 



0.0 



0.0 



0.0 

oT 



0.1 



0.0 



0.0 
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Table BF. Panel 4D 

Tissue Name | Rel. Exp.(%) Ag2078, Run 161905846 | Tissue Name | Rel. Exp.(%) Ag2078, Run 161905846
Secondary Th1 act | [illegible] | HUVEC IL-1beta | [illegible]
Secondary Th2 act | 0.0 | HUVEC IFN gamma | 0.0
Secondary Tr1 act | 0.0 | HUVEC TNF alpha + IFN gamma | 0.0
Secondary Th1 rest | 0.0 | HUVEC TNF alpha + IL4 | 0.0
Secondary Th2 rest | 0.0 | HUVEC IL-11 | 0.0
Secondary Tr1 rest | 0.0 | Lung Microvascular EC none | 0.2
Primary Th1 act | 0.0 | Lung Microvascular EC TNFalpha + IL-1beta | 0.2
Primary Th2 act | 0.0 | Microvascular Dermal EC none | 0.3
Primary Tr1 act | 0.0 | Microvascular Dermal EC TNFalpha + IL-1beta | 0.1
Primary Th1 rest | 0.0 | Bronchial epithelium TNFalpha + IL1beta | 0.1
Primary Th2 rest | 0.0 | Small airway epithelium none | 0.0
Primary Tr1 rest | 0.0 | Small airway epithelium TNFalpha + IL-1beta | 0.5
CD45RA CD4 lymphocyte act | 0.8 | Coronary artery SMC rest | 0.1
CD45RO CD4 lymphocyte act | 0.0 | Coronary artery SMC TNFalpha + IL-1beta | 0.1
CD8 lymphocyte act | 0.0 | Astrocytes rest | [illegible]
Secondary CD8 lymphocyte rest | 0.0 | Astrocytes TNFalpha + IL-1beta | [illegible]
Secondary CD8 lymphocyte act | 0.0 | KU-812 (Basophil) rest | [illegible]
CD4 lymphocyte none | 0.0 | KU-812 (Basophil) PMA/ionomycin | 0.0
2ry Th1/Th2/Tr1_anti-CD95 CH11 | 0.0 | CCD1106 (Keratinocytes) none | 0.0
LAK cells rest | 0.0 | CCD1106 (Keratinocytes) TNFalpha + IL-1beta | 0.0
LAK cells IL-2 | 0.0 | Liver cirrhosis | 0.4
LAK cells IL-2+IL-12 | 0.0 | Lupus kidney | 3.9
LAK cells IL-2+IFN gamma | 0.0 | NCI-H292 none | 1.3
LAK cells IL-2+IL-18 | 0.0 | NCI-H292 IL-4 | 0.5
LAK cells PMA/ionomycin | 0.1 | NCI-H292 IL-9 | 1.9
NK Cells IL-2 rest | 0.0 | NCI-H292 IL-13 | 0.3
Two Way MLR 3 day | 0.0 | NCI-H292 IFN gamma | 1.0
Two Way MLR 5 day | 0.0 | HPAEC none | 0.0
Two Way MLR 7 day | 0.0 | HPAEC TNF alpha + IL-1 beta | 0.0
PBMC rest | 0.1 | Lung fibroblast none | [illegible]
PBMC PWM | 0.0 | Lung fibroblast TNF alpha + IL-1 beta | 0.0
PBMC PHA-L | 0.0 | Lung fibroblast IL-4 | 0.0
Ramos (B cell) none | [illegible] | Lung fibroblast IL-9 | 0.0
Ramos (B cell) ionomycin | 0.0 | Lung fibroblast IL-13 | 0.0
B lymphocytes PWM | 0.0 | Lung fibroblast IFN gamma | 0.0
B lymphocytes CD40L and IL-4 | [illegible] | Dermal fibroblast CCD1070 rest | [illegible]
EOL-1 dbcAMP | 0.0 | Dermal fibroblast CCD1070 TNF alpha | 4.5
EOL-1 dbcAMP PMA/ionomycin | 0.2 | Dermal fibroblast CCD1070 IL-1 beta | 3.1
Dendritic cells none | 0.0 | Dermal fibroblast IFN gamma | 0.0
Dendritic cells LPS | 0.0 | Dermal fibroblast IL-4 | 0.0
Dendritic cells anti-CD40 | 0.0 | IBD Colitis 2 | 0.0
Monocytes rest | 0.0 | IBD Crohn's | 0.0
Monocytes LPS | 0.0 | Colon | 0.0
Macrophages rest | 0.0 | Lung | 0.2
Macrophages LPS | 0.0 | Thymus | 100.0
HUVEC none | 0.4 | Kidney | 0.4
HUVEC starved | 0.2 | |







Table BG. Panel 5D

Tissue Name | Rel. Exp.(%) Ag2078, Run 168095527 | Tissue Name | Rel. Exp.(%) Ag2078, Run 168095527
97457_Patient-02go_adipose | 11.7 | 94709_Donor 2 AM - A_adipose | 0.0
97476_Patient-07sk_skeletal muscle | 0.0 | 94710_Donor 2 AM - B_adipose | 0.0
97477_Patient-07ut_uterus | 2.8 | 94711_Donor 2 AM - C_adipose | 0.0
97478_Patient-07pl_placenta | 12.9 | 94712_Donor 2 AD - A_adipose | 1.0
97481_Patient-08sk_skeletal muscle | 0.0 | 94713_Donor 2 AD - B_adipose | 0.0
97482_Patient-08ut_uterus | 22.8 | 94714_Donor 2 AD - C_adipose | 0.0
97483_Patient-08pl_placenta | 4.5 | 94742_Donor 3 U - A_Mesenchymal Stem Cells | 0.0
97486_Patient-09sk_skeletal muscle | 0.0 | 94743_Donor 3 U - B_Mesenchymal Stem Cells | 0.0
97487_Patient-09ut_uterus | 0.0 | 94730_Donor 3 AM - A_adipose | 0.9
97488_Patient-09pl_placenta | 2.7 | 94731_Donor 3 AM - B_adipose | 0.0
97492_Patient-10ut_uterus | 100.0 | 94732_Donor 3 AM - C_adipose | 0.0
97493_Patient-10pl_placenta | [illegible] | 94733_Donor 3 AD - A_adipose | 0.0
97495_Patient-11go_adipose | 6.0 | 94734_Donor 3 AD - B_adipose | 0.0
97496_Patient-11sk_skeletal muscle | 0.0 | 94735_Donor 3 AD - C_adipose | 0.0
97497_Patient-11ut_uterus | 12.8 | 77138_Liver_HepG2untreated | 0.0
97498_Patient-11pl_placenta | 8.5 | 73556_Heart_Cardiac stromal cells (primary) | 0.0
97500_Patient-12go_adipose | 87.1 | 81735_Small Intestine | 0.0
97501_Patient-12sk_skeletal muscle | 0.0 | 72409_Kidney_Proximal Convoluted Tubule | 0.0
97502_Patient-12ut_uterus | 4.6 | 82685_Small intestine_Duodenum | 0.0
97503_Patient-12pl_placenta | 8.0 | 90650_Adrenal_Adrenocortical adenoma | 0.0
94721_Donor 2 U - A_Mesenchymal Stem Cells | 0.0 | 72410_Kidney_HRCE | 1.1
94722_Donor 2 U - B_Mesenchymal Stem Cells | 0.0 | 72411_Kidney_HRE | 5.3
94723_Donor 2 U - C_Mesenchymal Stem Cells | 0.0 | 73139_Uterus_Uterine smooth muscle cells | 2.4



CNS_neurodegeneration_v1.0 Summary: Ag5185 Low levels of expression of
this gene are seen in control temporal cortex and in a hippocampus sample from an
Alzheimer's patient (CTs=34.6-34.9). Therefore, therapeutic modulation of this gene may be
useful in the treatment of neurological disorders, including seizure and memory-related diseases.

General_screening_panel_v1.5 Summary: Ag5185 Highest expression of this
gene is detected in fetal kidney (CT=26.7). Interestingly, expression of this gene is higher
in fetal than in adult kidney (CT=31). This observation suggests that expression of
this gene can be used to distinguish fetal from adult kidney and also from other samples in
this panel. In addition, the relative overexpression of this gene in fetal tissue suggests that
the protein product may enhance kidney growth or development in the fetus and thus may
also act in a regenerative capacity in the adult. Therefore, therapeutic modulation of the
protein encoded by this gene could be useful in the treatment of kidney-related diseases,
including lupus and glomerulonephritis.
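
As a rough illustration of how the CT values quoted above relate to fold differences in expression, the following Python sketch converts a CT difference into a fold difference and into a relative expression percentage. It assumes ideal amplification (a doubling of product per cycle) and scaling of the highest-expressing sample to 100%; the function names are hypothetical and are not part of the described method.

```python
def fold_difference(ct_low: float, ct_high: float, efficiency: float = 2.0) -> float:
    """Fold excess of the sample with the lower CT over the sample with the higher CT.

    Assumes the amount of product multiplies by `efficiency` each cycle
    (2.0 corresponds to ideal doubling, which is an assumption here).
    """
    return efficiency ** (ct_high - ct_low)


def relative_expression_pct(ct: float, ct_min: float, efficiency: float = 2.0) -> float:
    """Expression relative to the highest-expressing sample (lowest CT), scaled to 100%."""
    return 100.0 * efficiency ** (ct_min - ct)


# CT values quoted in the panel 1.5 summary for Ag5185.
fetal_kidney_ct = 26.7
adult_kidney_ct = 31.0

fold = fold_difference(fetal_kidney_ct, adult_kidney_ct)
adult_pct = relative_expression_pct(adult_kidney_ct, fetal_kidney_ct)

print(f"fetal/adult kidney fold difference: {fold:.1f}")
print(f"adult kidney relative to fetal kidney: {adult_pct:.1f}%")
```

Under these assumptions a CT gap of 4.3 cycles corresponds to roughly a twenty-fold difference, which is why adult kidney appears at only a few percent of the fetal kidney signal.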

Moderate to low levels of expression of this gene are also seen in tissues with
metabolic/endocrine functions such as pancreas, adipose, adrenal and pituitary glands,
heart, skeletal muscle, and gastrointestinal tract. Therefore, therapeutic modulation of the
activity of this gene may prove useful in the treatment of endocrine/metabolically related
diseases, such as obesity and diabetes.

Moderate to low levels of expression of this gene are also seen in a number of cancer
cell lines derived from colon, lung, and ovarian cancer. Therefore, therapeutic modulation
of this gene may be useful in the treatment of colon, lung and ovarian cancers.

Panel 1.3D Summary: Ag2078 Three experiments with the same probe-primer set
are in excellent agreement. Highest expression of this gene is seen in fetal kidney
(CTs=26-27.8), with lower expression in the adult lung. This pattern correlates with the
expression seen in panel 1.5. Please see panel 1.5 for further discussion of this gene.

Panel 4D Summary: Ag2078 Highest expression of this gene is detected in
thymus (CT=27.3). This gene or its protein product may thus play an important role in T
cell development. Small molecule therapeutics or antibody therapeutics designed against
the protein encoded by this gene could be utilized to modulate immune function (T cell
development) and be important for organ transplant, AIDS treatment or post-chemotherapy
immune reconstitution.

Moderate to low levels of expression of this gene are also seen in lupus kidney,
resting and cytokine-activated mucoepidermoid NCI-H292 cells and dermal fibroblasts.
Therefore, therapeutic modulation of this gene may be useful in the treatment of chronic
obstructive pulmonary disease, asthma, allergy, emphysema, lupus kidney and skin
disorders, including psoriasis.

Panel 5D Summary: Ag2078 Highest expression of this gene is detected in uterus
and adipose of diabetic patients on insulin (CT=30.9-31). In addition, moderate to low
levels of expression of this gene are also seen in uterus and placenta. Therefore, therapeutic
modulation of this gene may be useful in the treatment of obesity and diabetes.

C. CG118051-02: ALDH8 splice variant, submitted to study
DDSMT on 09/26/01 by saguo; classification type=Finished In-silico;
novelty=Update-Variants; ORF start=407, ORF stop=1436, frame=2;
1586 bp.

Expression of gene CG118051-02 was assessed using the primer-probe set Ag3729,
described in Table CA. Results of the RTQ-PCR runs are shown in Tables CB and CC.
Table CA. Probe Name Ag3729

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-ttcaagaaaacaagcagcttct-3' | 22 | [illegible] | 258
Probe | TET-5'-cccaggacctgcataagccagct-3'-TAMRA | 23 | 309 | 259
Reverse | 5'-ctcagatatgtctgcctcgaa-3' | 21 | 332 | 260



Table CB. Panel 2.2

Tissue Name | Rel. Exp.(%) Ag3729, Run 174441818 | Rel. Exp.(%) Ag3729, Run 259034396 | Tissue Name | Rel. Exp.(%) Ag3729, Run 174441818 | Rel. Exp.(%) Ag3729, Run 259034396
Normal Colon | 0.4 | 0.3 | Kidney Margin (OD04348) | 0.0 | 0.0
Colon cancer (OD06064) | 1.4 | 1.0 | Kidney malignant cancer (OD06204B) | 0.0 | 0.0
Colon Margin [ID illegible] | 0.0 | 0.0 | Kidney normal adjacent tissue (OD06204E) | 0.0 | 0.0
Colon cancer (OD06159) | 0.2 | 0.1 | Kidney Cancer (OD04450-01) | 0.0 | 0.0
Colon Margin (OD06159) | 0.0 | 0.0 | Kidney Margin (OD04450-03) | 1.3 | 0.9
Colon cancer (OD06297-04) | 0.0 | 0.0 | Kidney Cancer 8120613 | 0.0 | 0.0
Colon Margin (OD06297-05) | 0.0 | 0.0 | Kidney Margin 8120614 | 0.0 | 0.0
CC Gr.2 ascend colon (OD03921) | 1.1 | 0.8 | Kidney Cancer 9010320 | 0.5 | 0.3
CC Margin (OD03921) | 0.0 | 0.0 | Kidney Margin 9010321 | 1.8 | 1.4
Colon cancer metastasis (OD06104) | 0.2 | 0.1 | Kidney Cancer 8120607 | 0.0 | 0.0
Lung Margin (OD06104) | 0.0 | 0.0 | Kidney Margin 8120608 | 1.0 | 0.8
Colon mets to lung (OD04451-01) | 0.2 | 0.2 | Normal Uterus | 0.0 | 0.0
Lung Margin (OD04451-02) | 0.0 | 0.0 | Uterine Cancer 064011 | 1.8 | 1.2
Normal Prostate | 2.3 | 1.8 | Normal Thyroid | 0.0 | 0.0
Prostate Cancer (OD04410) | 2.2 | 1.6 | Thyroid Cancer 064010 | 0.0 | 0.0
Prostate Margin (OD04410) | 5.1 | 3.8 | Thyroid Cancer A302152 | 0.0 | 0.0
Normal Ovary | 0.7 | 0.3 | Thyroid Margin A302153 | 0.0 | 0.0
Ovarian cancer (OD06283-03) | 2.5 | 1.7 | Normal Breast | 9.2 | 6.5
Ovarian Margin (OD06283-07) | 0.0 | 0.0 | Breast Cancer (OD04566) | 17.4 | 12.9
Ovarian Cancer 064008 | 1.0 | [illegible] | Breast Cancer 1024 | 100.0 | 100.0
Ovarian cancer (OD06145) | 0.4 | 0.3 | Breast Cancer (OD04590-01) | 3.9 | 2.5
Ovarian Margin (OD06145) | 0.5 | 0.3 | Breast Cancer Mets (OD04590-03) | 1.2 | 0.9
Ovarian cancer (OD06455-03) | 0.9 | 0.5 | Breast Cancer Metastasis (OD04655-05) | 48.6 | 34.4
Ovarian Margin (OD06455-07) | 0.0 | 0.0 | Breast Cancer 064006 | 2.4 | 2.1
Normal Lung | 0.0 | 0.0 | Breast Cancer 9100266 | 55.1 | 43.8
Invasive poor diff. lung adeno (ODO4945-01) | 9.2 | 7.5 | Breast Margin 9100265 | 14.7 | 10.8
Lung Margin (ODO4945-03) | 0.0 | 0.0 | Breast Cancer A209073 | 32.1 | 24.5
Lung Malignant Cancer (OD03126) | 0.5 | 0.4 | Breast Margin A2090734 | 9.1 | 6.4
Lung Margin (OD03126) | 0.4 | 0.3 | Breast cancer (OD06083) | 69.7 | 61.6
Lung Cancer (OD05014A) | 0.0 | 0.0 | Breast cancer node metastasis (OD06083) | 28.5 | 23.3
Lung Margin (OD05014B) | 0.8 | 0.6 | Normal Liver | 0.0 | 0.0
Lung cancer (OD06081) | 44.8 | 0.3 | Liver Cancer 1026 | 0.0 | 0.0
Lung Margin (OD06081) | 0.0 | 0.0 | Liver Cancer 1025 | 0.8 | 0.6
Lung Cancer (OD04237-01) | 3.1 | 2.6 | Liver Cancer 6004-T | 0.2 | 0.1
Lung Margin (OD04237-02) | 0.4 | 0.3 | Liver Tissue 6004-N | 0.4 | 0.3
Ocular Melanoma Metastasis | 0.0 | 0.0 | Liver Cancer 6005-T | 0.0 | 0.0
Ocular Melanoma Margin (Liver) | 0.0 | 0.0 | Liver Tissue 6005-N | 0.0 | 0.0
Melanoma Metastasis | 0.0 | 0.0 | Liver Cancer 064003 | 0.0 | 0.0
Melanoma Margin (Lung) | 0.3 | 0.2 | Normal Bladder | 0.0 | 0.0






WO 03/029424 



PCT/US02/31373 



Normal Kidney | 0.0 | 0.0 | Bladder Cancer 1023 | 3.2 | 2.3
Kidney Ca, Nuclear grade 2 (OD04338) | 1.5 | 1.2 | Bladder Cancer A302173 | 4.5 | 3.2
Kidney Margin (OD04338) | 0.4 | 0.3 | Normal Stomach | 0.0 | 0.0
Kidney Ca Nuclear grade 1/2 (OD04339) | 0.0 | 0.0 | Gastric Cancer 9060397 | 0.5 | 0.3
Kidney Margin (OD04339) | 0.0 | 0.0 | Stomach Margin 9060396 | 2.1 | 1.4
Kidney Ca, Clear cell type (OD04340) | 0.0 | 0.0 | Gastric Cancer 9060395 | 2.5 | 1.7
Kidney Margin (OD04340) | 0.4 | 0.3 | Stomach Margin 9060394 | 1.8 | 1.1
Kidney Ca, Nuclear grade 3 (OD04348) | 0.0 | 0.0 | Gastric Cancer 064005 | 0.0 | 0.0



Table CC. Panel 4.1D

Tissue Name | Rel. Exp.(%) Ag3729, Run 170222887 | Tissue Name | Rel. Exp.(%) Ag3729, Run 170222887
Secondary Th1 act | 0.0 | HUVEC IL-1beta | 0.0
Secondary Th2 act | 0.0 | HUVEC IFN gamma | 0.0
Secondary Tr1 act | 0.0 | HUVEC TNF alpha + IFN gamma | 0.0
Secondary Th1 rest | 0.0 | HUVEC TNF alpha + IL4 | 0.0
Secondary Th2 rest | 0.0 | HUVEC IL-11 | 0.0
Secondary Tr1 rest | 0.0 | Lung Microvascular EC none | 0.0
Primary Th1 act | 0.0 | Lung Microvascular EC TNFalpha + IL-1beta | 0.0
Primary Th2 act | 0.0 | Microvascular Dermal EC none | 0.0
Primary Tr1 act | 0.0 | Microvascular Dermal EC TNFalpha + IL-1beta | 0.0
Primary Th1 rest | 0.0 | Bronchial epithelium TNFalpha + IL1beta | 26.8
Primary Th2 rest | 0.0 | Small airway epithelium none | 25.5
Primary Tr1 rest | 0.0 | Small airway epithelium TNFalpha + IL-1beta | 46.7
CD45RA CD4 lymphocyte act | 0.0 | Coronary artery SMC rest | 0.0
CD45RO CD4 lymphocyte act | 0.0 | Coronary artery SMC TNFalpha + IL-1beta | 0.0
CD8 lymphocyte act | 0.0 | Astrocytes rest | 0.0
Secondary CD8 lymphocyte rest | 0.0 | Astrocytes TNFalpha + IL-1beta | 0.0
Secondary CD8 lymphocyte act | 0.0 | KU-812 (Basophil) rest | 0.0
CD4 lymphocyte none | 0.0 | KU-812 (Basophil) PMA/ionomycin | 0.0
2ry Th1/Th2/Tr1_anti-CD95 CH11 | 0.0 | CCD1106 (Keratinocytes) none | 0.0
LAK cells rest | 0.0 | CCD1106 (Keratinocytes) TNFalpha + IL-1beta | 6.7
LAK cells IL-2 | 0.0 | Liver cirrhosis | 0.0
LAK cells IL-2+IL-12 | 0.0 | NCI-H292 none | 100.0
LAK cells IL-2+IFN gamma | 0.0 | NCI-H292 IL-4 | 55.9
LAK cells IL-2+IL-18 | 0.0 | NCI-H292 IL-9 | 82.9
LAK cells PMA/ionomycin | 0.0 | NCI-H292 IL-13 | 58.2
NK Cells IL-2 rest | 0.0 | NCI-H292 IFN gamma | 60.3
Two Way MLR 3 day | 0.0 | HPAEC none | 0.0
Two Way MLR 5 day | 0.0 | HPAEC TNF alpha + IL-1 beta | 0.0
Two Way MLR 7 day | 0.0 | Lung fibroblast none | 0.0
PBMC rest | 0.0 | Lung fibroblast TNF alpha + IL-1 beta | 0.0
PBMC PWM | 0.0 | Lung fibroblast IL-4 | 0.0
PBMC PHA-L | 0.0 | Lung fibroblast IL-9 | 0.0
Ramos (B cell) none | 7.4 | Lung fibroblast IL-13 | 0.0
Ramos (B cell) ionomycin | 3.1 | Lung fibroblast IFN gamma | 0.0
B lymphocytes PWM | 0.0 | Dermal fibroblast CCD1070 rest | 0.0
B lymphocytes CD40L and IL-4 | 0.0 | Dermal fibroblast CCD1070 TNF alpha | 0.0
EOL-1 dbcAMP | 0.0 | Dermal fibroblast CCD1070 IL-1 beta | 0.0
EOL-1 dbcAMP PMA/ionomycin | 0.0 | Dermal fibroblast IFN gamma | 0.0
Dendritic cells none | 0.0 | Dermal fibroblast IL-4 | 0.0
Dendritic cells LPS | 0.0 | Dermal Fibroblasts rest | 0.0
Dendritic cells anti-CD40 | 0.0 | Neutrophils TNFa+LPS | 0.0
Monocytes rest | 0.0 | Neutrophils rest | 0.0
Monocytes LPS | 0.0 | Colon | 0.0
Macrophages rest | 0.0 | Lung | 6.3
Macrophages LPS | 0.0 | Thymus | 7.8
HUVEC none | 0.0 | Kidney | 2.6
HUVEC starved | 0.0 | |







CNS_neurodegeneration_v1.0 Summary: Ag3729 Expression of this gene is
low/undetectable in all samples on this panel (CTs>35).
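
The CT > 35 criterion used in these panel summaries can be expressed as a simple check. The Python sketch below is illustrative only: the function names and the example CT list are hypothetical, and the only value taken from the text is the cutoff of 35 cycles, together with the CTs of 34.6-34.9 that are reported elsewhere in this document as low-level expression.

```python
CT_CUTOFF = 35.0  # cycles; samples above this are described as low/undetectable


def is_low_or_undetectable(ct: float, cutoff: float = CT_CUTOFF) -> bool:
    """True when a sample's CT exceeds the cutoff."""
    return ct > cutoff


def panel_undetectable(cts) -> bool:
    """True when every sample on a (non-empty) panel exceeds the cutoff."""
    cts = list(cts)
    return bool(cts) and all(is_low_or_undetectable(ct) for ct in cts)


example_panel = [36.2, 38.0, 40.0]  # hypothetical CTs, for illustration only
borderline = [34.6, 34.9]           # CTs reported as low-level expression for Ag5185

print(panel_undetectable(example_panel))  # every value is above 35
print(panel_undetectable(borderline))     # both values fall just under the cutoff
```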

Panel 2.2 Summary: Ag3729 Two experiments with the same probe-primer set are in
good agreement. Highest expression of this gene is seen in breast cancer (CTs=27-29).
Thus, expression of this gene could be used to differentiate between the breast cancer
samples and other samples on this panel.

In addition, moderate expression of this gene is also seen in cancer samples derived
from colon, breast, ovarian, lung, bladder, kidney and uterine cancers. Interestingly,
expression of this gene is higher in cancer than in the corresponding normal adjacent tissue.
Thus, expression of this gene may be used as a diagnostic marker to detect the presence of
colon, breast, ovarian, lung, bladder, kidney and uterine cancers. In addition, therapeutic
modulation of the expression or function of this gene may be effective in the treatment of
these cancers.
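
The comparison between cancer samples and adjacent margin tissue can be checked pair by pair. The Python sketch below does so for a subset of rows from Table CB (run 174441818). Treating rows that share a sample identifier, or carry consecutive identifiers, as matched tumor/margin pairs is an assumption made for illustration, and the helper name is hypothetical.

```python
import math

# (label, tumor Rel. Exp. %, margin Rel. Exp. %) taken from Table CB, run 174441818.
# The pairing of tumor and margin rows is an assumption made for this sketch.
PAIRS = [
    ("Breast 9100266 / 9100265", 55.1, 14.7),
    ("Breast A209073 / A2090734", 32.1, 9.1),
    ("Ovarian OD06283", 2.5, 0.0),
    ("Colon OD06159", 0.2, 0.0),
    ("Prostate OD04410", 2.2, 5.1),
    ("Kidney 9010320 / 9010321", 0.5, 1.8),
]


def tumor_to_margin_ratio(tumor: float, margin: float) -> float:
    """Ratio of tumor to margin expression; infinite when the margin reads 0.0."""
    if margin == 0.0:
        return math.inf if tumor > 0.0 else math.nan
    return tumor / margin


ratios = {label: tumor_to_margin_ratio(t, m) for label, t, m in PAIRS}
higher_in_tumor = [label for label, r in ratios.items() if r > 1.0]

for label, r in ratios.items():
    print(f"{label}: tumor/margin = {r:.2f}")
print(f"{len(higher_in_tumor)} of {len(PAIRS)} pairs are higher in tumor")
```

In this subset the breast, ovarian and colon pairs are higher in tumor, while the prostate and kidney pairs are not, so the pattern is best evaluated tissue by tissue.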

Panel 4.1D Summary: Ag3729 Expression of this gene is restricted to a few
samples, with highest expression seen in untreated NCI-H292 cells (CT=31.4). The gene
is also expressed in a cluster of treated and untreated samples derived from the NCI-H292
cell line, a human airway epithelial cell line that produces mucins. Mucus overproduction is
an important feature of bronchial asthma and chronic obstructive pulmonary disease.
Interestingly, the transcript is also expressed at lower but still significant levels in
small airway and bronchial epithelium treated with IL-1 beta and TNF-alpha and in untreated
small airway epithelium. The expression of the transcript in this mucoepidermoid cell line
that is often used as a model for airway epithelium (NCI-H292 cells) suggests that this
transcript may be important in the proliferation or activation of airway epithelium.
Therefore, therapeutics designed with the protein encoded by the transcript may reduce or
eliminate symptoms caused by inflammation in lung epithelia in chronic obstructive
pulmonary disease, asthma, allergy, and emphysema.

D. CG140468-02: SERINE/THREONINE-PROTEIN KINASE
PAK 1.

Expression of gene CG140468-02 was assessed using the primer-probe set Ag7054,
described in Table DA. Results of the RTQ-PCR runs are shown in Table DB. Please note
that CG140468-02 represents a full-length physical clone.
Table DA. Probe Name Ag7054

Primers | Sequence | Length | Start Position | SEQ ID No
Forward | 5'-ggtttgagaagattgccaagc-3' | 21 | 819 | 261
Probe | TET-5'-cctcactccactgattgctgcagctaa-3'-TAMRA | 27 | 850 | 262
Reverse | 5'-ctggggtgagtgtggttttag-3' | 21 | 898 | 263
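
The lengths and start positions listed in Table DA can be cross-checked against the oligonucleotide sequences themselves. The following Python sketch does this and derives the span covered by the primer pair. It assumes that each start position is a 1-based coordinate of the left-most base of the oligonucleotide on the target sequence; that convention, and the class and variable names, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Oligo:
    name: str
    sequence: str
    start: int  # assumed: 1-based position of the left-most base on the target

    @property
    def length(self) -> int:
        return len(self.sequence)

    @property
    def end(self) -> int:
        # Last position covered by the oligo, inclusive.
        return self.start + self.length - 1


# Sequences and start positions as listed in Table DA (probe set Ag7054).
forward = Oligo("forward", "ggtttgagaagattgccaagc", 819)
probe = Oligo("probe", "cctcactccactgattgctgcagctaa", 850)
reverse = Oligo("reverse", "ctggggtgagtgtggttttag", 898)

# Span from the first base of the forward primer to the last base of the reverse primer.
amplicon_length = reverse.end - forward.start + 1

# The probe should sit between the two primers without overlapping either.
probe_between_primers = forward.end < probe.start and probe.end < reverse.start

print(f"lengths: {forward.length}, {probe.length}, {reverse.length}")
print(f"amplicon span: {forward.start}-{reverse.end} ({amplicon_length} bases)")
print(f"probe lies between primers: {probe_between_primers}")
```

Under the stated coordinate assumption, the computed lengths agree with the Length column and the probe falls entirely between the two primers.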



Table DB. General_screening_panel_v1.6

Tissue Name | Rel. Exp.(%) Ag7054, Run 282273878 | Tissue Name | Rel. Exp.(%) Ag7054, Run 282273878
Adipose | 3.6 | Renal ca. TK-10 | 10.7
Melanoma* Hs688(A).T | 7.3 | Bladder | 9.0
Melanoma* Hs688(B).T | 6.6 | Gastric ca. (liver met.) NCI-N87 | 30.6
Melanoma* M14 | 13.3 | Gastric ca. KATO III | 49.3
Melanoma* LOXIMVI | 21.6 | Colon ca. SW-948 | 7.8
Melanoma* SK-MEL-5 | 8.1 | Colon ca. SW480 | 2.5
Squamous cell carcinoma SCC-4 | 7.7 | Colon ca.* (SW480 met) SW620 | 11.8
Testis Pool | 5.6 | Colon ca. HT29 | 22.2
Prostate ca.* (bone met) PC-3 | 3.3 | Colon ca. HCT-116 | 19.1
Prostate Pool | 8.0 | Colon ca. CaCo-2 | 34.6
Placenta | 9.5 | Colon cancer tissue | 9.0
Uterus Pool | 2.4 | Colon ca. SW1116 | 4.5
Ovarian ca. OVCAR-3 | 100.0 | Colon ca. Colo-205 | 10.2
Ovarian ca. SK-OV-3 | 16.4 | Colon ca. SW-48 | 8.0
Ovarian ca. OVCAR-4 | 3.3 | Colon Pool | 9.1
Ovarian ca. OVCAR-5 | 35.1 | Small Intestine Pool | 8.9
Ovarian ca. IGROV-1 | 5.3 | Stomach Pool | 5.1
Ovarian ca. OVCAR-8 | 8.4 | Bone Marrow Pool | 3.4
Ovary | 5.1 | Fetal Heart | 1.5
Breast ca. MCF-7 | 2.2 | Heart Pool | 3.7
Breast ca. MDA-MB-231 | 11.8 | Lymph Node Pool | 8.3
Breast ca. BT 549 | 4.2 | Fetal Skeletal Muscle | 8.1
Breast ca. T47D | 7.7 | Skeletal Muscle Pool | 4.3
Breast ca. MDA-N | 5.8 | Spleen Pool | 5.1
Breast Pool | 8.8 | Thymus Pool | 7.6
Trachea | 7.7 | CNS cancer (glio/astro) U87-MG | 6.3
Lung | 4.1 | CNS cancer (glio/astro) U-118-MG | 12.7
Fetal Lung | 7.9 | CNS cancer (neuro;met) SK-N-AS | 6.2
Lung ca. NCI-N417 | 7.9 | CNS cancer (astro) SF-539 | 7.4
Lung ca. LX-1 | 19.9 | CNS cancer (astro) SNB-75 | 14.1
Lung ca. NCI-H146 | 3.5 | CNS cancer (glio) SNB-19 | 5.5
Lung ca. SHP-77 | 5.8 | CNS cancer (glio) SF-295 | 5.8
Lung ca. A549 | 8.8 | Brain (Amygdala) Pool | 24
Lung ca. NCI-H526 | 3.5 | Brain (cerebellum) | 85.9
Lung ca. NCI-H23 | 11.0 | Brain (fetal) | 16.4
Lung ca. NCI-H460 | 1.0 | Brain (Hippocampus) Pool | 21.2
Lung ca. HOP-62 | 3.5 | Cerebral Cortex Pool | 64.6
Lung ca. NCI-H522 | 20.7 | Brain (Substantia nigra) Pool | 27.9
Liver | 0.7 | Brain (Thalamus) Pool | 51.8
Fetal Liver | 9.1 | Brain (whole) | 55.5
Liver ca. HepG2 | 0.5 | Spinal Cord Pool | 5.0
Kidney Pool | 11.3 | Adrenal Gland | 4.9
Fetal Kidney | 16.0 | Pituitary gland Pool | 4.9
Renal ca. 786-0 | 9.9 | Salivary Gland | 2.7
Renal ca. A498 | 4.4 | Thyroid (female) | 5.8
Renal ca. ACHN | 6.9 | Pancreatic ca. CAPAN2 | 9.7
Renal ca. UO-31 | 13.5 | Pancreas Pool | 5.5



General_screening_panel_v1.6 Summary: Ag7054 Highest expression of this
gene is detected in an ovarian cancer cell line (CT=25.4). Moderate levels of expression of
this gene are also seen in a cluster of cancer cell lines derived from pancreatic, gastric, colon,
lung, liver, renal, breast, ovarian, prostate, squamous cell carcinoma, melanoma and brain
cancers. Thus, expression of this gene could be used as a marker to detect the presence of
these cancers. Furthermore, therapeutic modulation of the expression or function of this
gene may be effective in the treatment of pancreatic, gastric, colon, lung, liver, renal,
breast, ovarian, prostate, squamous cell carcinoma, melanoma and brain cancers.

Among tissues with metabolic or endocrine function, this gene is expressed at
moderate levels in pancreas, adipose, adrenal gland, thyroid, pituitary gland, skeletal
muscle, heart, liver and the gastrointestinal tract. Therefore, therapeutic modulation of the
activity of this gene may prove useful in the treatment of endocrine/metabolically related
diseases, such as obesity and diabetes.

In addition, this gene is expressed at high levels in all regions of the central nervous
system examined, including amygdala, hippocampus, substantia nigra, thalamus,
cerebellum, cerebral cortex, and spinal cord. Therefore, therapeutic modulation of this gene
product may be useful in the treatment of central nervous system disorders such as
Alzheimer's disease, Parkinson's disease, epilepsy, multiple sclerosis, schizophrenia and
depression.

Interestingly, this gene is expressed at much higher levels in fetal liver (CT=[illegible])
compared to adult liver (CT=32.7). This observation suggests that expression of this gene
can be used to distinguish fetal from adult liver. In addition, the relative overexpression of
this gene in fetal tissue suggests that the protein product may enhance liver growth or
development in the fetus and thus may also act in a regenerative capacity in the adult.
Therefore, therapeutic modulation of the protein encoded by this gene could be useful in
the treatment of liver-related diseases.
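
The relative expression percentages in Table DB and the CT values quoted in this summary can be related to one another. The Python sketch below estimates a CT from a relative expression percentage, assuming ideal doubling per cycle and taking the highest-expressing sample (the ovarian cancer cell line, CT=25.4) as 100%. The function name is hypothetical, and the results are estimates derived from rounded table values rather than measured CTs.

```python
import math


def estimated_ct(rel_pct: float, ct_at_100: float, efficiency: float = 2.0) -> float:
    """Estimate a sample's CT from its relative expression (% of the top sample).

    Assumes product multiplies by `efficiency` per cycle (2.0 = ideal doubling).
    """
    if rel_pct <= 0.0:
        raise ValueError("relative expression must be positive to estimate a CT")
    return ct_at_100 + math.log(100.0 / rel_pct, efficiency)


TOP_SAMPLE_CT = 25.4  # ovarian cancer cell line, reported as the 100% sample

adult_liver_ct = estimated_ct(0.7, TOP_SAMPLE_CT)  # Liver: 0.7% in Table DB
fetal_liver_ct = estimated_ct(9.1, TOP_SAMPLE_CT)  # Fetal Liver: 9.1% in Table DB

print(f"estimated adult liver CT: {adult_liver_ct:.1f}")
print(f"estimated fetal liver CT: {fetal_liver_ct:.1f}")
print(f"estimated fetal/adult fold difference: {2.0 ** (adult_liver_ct - fetal_liver_ct):.0f}")
```

The adult liver estimate lands within a fraction of a cycle of the reported CT=32.7, which suggests that the percentages and CT values in this section are mutually consistent under the doubling assumption.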

E. CG142564-01: CARNITINE
O-PALMITOYLTRANSFERASE I.

Expression of gene CG142564-01 was assessed using the primer-probe set Ag6952,
described in Table EA. Results of the RTQ-PCR runs are shown in Table EB. Please note
that CG142564-02 represents a full-length physical clone.
Table EA. Probe Name Ag6952

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-tctgctaccaatcccagatcc-3' | 21 | 434 | 264
Probe | TET-5'-tcgacccagagcagcacccca-3'-TAMRA | 21 | 461 | 265
Reverse | 5'-catctgctacagggccaaag-3' | 20 | 504 | 266



Table EB. General_screening_panel_v1.6

Tissue Name | Rel. Exp.(%) Ag6952, Run 278388893 | Tissue Name | Rel. Exp.(%) Ag6952, Run 278388893
Adipose | 4.1 | Renal ca. TK-10 | 20.0
Melanoma* Hs688(A).T | 0.8 | Bladder | 33.4
Melanoma* Hs688(B).T | 1.2 | Gastric ca. (liver met) NCI-N87 | 81.2
Melanoma* M14 | 21.8 | Gastric ca. KATO III | 8.2
Melanoma* LOXIMVI | 4.6 | Colon ca. SW-948 | 5.4
Melanoma* SK-MEL-5 | 8.5 | Colon ca. SW480 | 14.8
Squamous cell carcinoma SCC-4 | 1.6 | Colon ca.* (SW480 met) SW620 | 17.1
Testis Pool | 31.6 | Colon ca. HT29 | 1.3
Prostate ca.* (bone met) PC-3 | 9.3 | Colon ca. HCT-116 | 14.3
Prostate Pool | 5.8 | Colon ca. CaCo-2 | [illegible]
Placenta | 8.5 | Colon cancer tissue | 7.6
Uterus Pool | 0.7 | Colon ca. SW1116 | 4.4
Ovarian ca. OVCAR-3 | 5.0 | Colon ca. Colo-205 | 4.7
Ovarian ca. SK-OV-3 | 50.7 | Colon ca. SW-48 | 2.6
Ovarian ca. OVCAR-4 | 1.9 | Colon Pool | 3.4
Ovarian ca. OVCAR-5 | 25.3 | Small Intestine Pool | 2.9
Ovarian ca. IGROV-1 | 6.9 | Stomach Pool | 2.9
Ovarian ca. OVCAR-8 | 4.7 | Bone Marrow Pool | 1.5
Ovary | 3.0 | Fetal Heart | 100.0
Breast ca. MCF-7 | 9.7 | Heart Pool | 42.6
Breast ca. MDA-MB-231 | 9.1 | Lymph Node Pool | 2.9
Breast ca. BT 549 | 14.3 | Fetal Skeletal Muscle | 17.9
Breast ca. T47D | 3.3 | Skeletal Muscle Pool | 21.8
Breast ca. MDA-N | 0.8 | Spleen Pool | 10.4
Breast Pool | 3.1 | Thymus Pool | 17.9
Trachea | 3.8 | CNS cancer (glio/astro) U87-MG | 12.3
Lung | 3.0 | CNS cancer (glio/astro) U-118-MG | 25.3
Fetal Lung | 7.3 | CNS cancer (neuro;met) SK-N-AS | 21.0
Lung ca. NCI-N417 | 1.2 | CNS cancer (astro) SF-539 | 2.6
Lung ca. LX-1 | 22.8 | CNS cancer (astro) SNB-75 | 16.5
Lung ca. NCI-H146 | 3.6 | CNS cancer (glio) SNB-19 | 10.1
Lung ca. SHP-77 | 26.4 | CNS cancer (glio) SF-295 | 61.1
Lung ca. A549 | 13.4 | Brain (Amygdala) Pool | 4.5
Lung ca. NCI-H526 | 0.8 | Brain (cerebellum) | 39.0
Lung ca. NCI-H23 | 13.8 | Brain (fetal) | 13.2
Lung ca. NCI-H460 | 13.9 | Brain (Hippocampus) Pool | 3.6
Lung ca. HOP-62 | 32.8 | Cerebral Cortex Pool | 3.4
Lung ca. NCI-H522 | 21.6 | Brain (Substantia nigra) Pool | 5.3
Liver | 0.4 | Brain (Thalamus) Pool | 5.6
Fetal Liver | 2.2 | Brain (whole) | 3.3
Liver ca. HepG2 | 5.0 | Spinal Cord Pool | 4.8
Kidney Pool | 2.7 | Adrenal Gland | 6.9
Fetal Kidney | 4.6 | Pituitary gland Pool | 3.2
Renal ca. 786-0 | 14.6 | Salivary Gland | 4.9
Renal ca. A498 | 1.8 | Thyroid (female) | 1.1
Renal ca. ACHN | 7.6 | Pancreatic ca. CAPAN2 | 12.1
Renal ca. UO-31 | 11.9 | Pancreas Pool | 5.0



General_screening_panel_v1.6 Summary: Ag6952 Highest expression of this
gene is detected in fetal heart (CT=26.7). Moderate to high levels of expression of this gene
are also seen in tissues with metabolic/endocrine functions such as pancreas, adipose,
adrenal gland, thyroid, pituitary gland, skeletal muscle, heart, liver and the gastrointestinal
tract. Therefore, therapeutic modulation of the activity of this gene may prove useful in the
treatment of endocrine/metabolically related diseases, such as obesity and diabetes.

Moderate levels of expression of this gene are also seen in a cluster of cancer cell lines
derived from pancreatic, gastric, colon, lung, liver, renal, breast, ovarian, prostate,
squamous cell carcinoma, melanoma and brain cancers. Thus, expression of this gene could
be used as a marker to detect the presence of these cancers. Furthermore, therapeutic
modulation of the expression or function of this gene may be effective in the treatment of
pancreatic, gastric, colon, lung, liver, renal, breast, ovarian, prostate, squamous cell
carcinoma, melanoma and brain cancers.

In addition, this gene is expressed at moderate levels in all regions of the central
nervous system examined, including amygdala, hippocampus, substantia nigra, thalamus,
cerebellum, cerebral cortex, and spinal cord. Therefore, therapeutic modulation of this gene
product may be useful in the treatment of central nervous system disorders such as
Alzheimer's disease, Parkinson's disease, epilepsy, multiple sclerosis, schizophrenia and
depression.

F. CG142797-01: Cathepsin L like. 

Expression of gene CG142797-01 was assessed using the primer-probe set Ag7539, 
described in Table FA. 

Table FA. Probe Name Ag7539

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-ctctaacacgtgaccacagtctaga-3' | 25 | 68 | 267
Probe | TET-5'-tcttgtgctttgccttccacttggt-3'-TAMRA | 25 | 103 | 268
Reverse | 5'-atcttcatgttctccatgtcatataatc-3' | 28 | 128 | 269



CNS_neurodegeneration_v1.0 Summary: Ag7539 Expression of this gene is
low/undetectable (CTs > 35) across all of the samples on this panel.

Panel 4.1D Summary: Ag7539 Expression of this gene is low/undetectable (CTs
> 35) across all of the samples on this panel.
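
The only detection rule stated in these summaries is that CT values above 35 are reported as low/undetectable. A minimal sketch of that rule follows; the function names and the choice to treat a CT of exactly 35 as detectable are illustrative assumptions, not part of the described procedure.

```python
# Cutoff taken from the summaries: CTs > 35 are reported as low/undetectable.
CT_CUTOFF = 35.0


def is_detectable(ct: float, cutoff: float = CT_CUTOFF) -> bool:
    """Return True when a CT value is at or below the cutoff (assumed inclusive)."""
    return ct <= cutoff


def panel_is_undetectable(ct_values, cutoff: float = CT_CUTOFF) -> bool:
    """Return True when every sample on a panel has a CT above the cutoff."""
    return all(not is_detectable(ct, cutoff) for ct in ct_values)
```

For example, the fetal heart value reported above (CT=26.7) passes `is_detectable`, whereas a panel whose samples all exceed 35 would be summarized as low/undetectable.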
G. CG143216-01: Diacylglycerol Kinase.

Expression of gene CG143216-01 was assessed using the primer-probe sets Ag4554 
and Ag7230, described in Tables GA and GB. Results of the RTQ-PCR runs are shown in 
Tables GC, GD, GE and GF. 

Table GA. Probe Name Ag4554

Primers | Sequence | Length | Start Position | SEQ ID No
Forward | 5'-aatgctccaggttcaattttct-3' | 22 | 1349 | 270
Probe | TET-5'-accaaccagcaggaccagtttgactt-3'-TAMRA | 26 | 1390 | 271
Reverse | 5'-gacgcgataaacttcaacaaaa-3' | 22 | 1419 | 272
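
The Length column of a primer-probe table can be checked mechanically against the listed sequences. The sketch below does this for the Ag4554 set; the `Oligo` type and helper names are hypothetical. The amplicon calculation additionally assumes that each Start Position is the first base of the oligo on the target sequence, which the tables do not state.

```python
from dataclasses import dataclass


@dataclass
class Oligo:
    role: str
    sequence: str   # bases only, without 5'/3' labels or TET/TAMRA dye labels
    length: int     # value listed in the Length column
    start: int      # value listed in the Start Position column
    seq_id: int     # value listed in the SEQ ID No column


# Values transcribed from Table GA (probe set Ag4554).
TABLE_GA = [
    Oligo("Forward", "aatgctccaggttcaattttct", 22, 1349, 270),
    Oligo("Probe", "accaaccagcaggaccagtttgactt", 26, 1390, 271),
    Oligo("Reverse", "gacgcgataaacttcaacaaaa", 22, 1419, 272),
]


def length_mismatches(oligos):
    """Return roles whose sequence length or alphabet disagrees with the table."""
    bad = []
    for o in oligos:
        if len(o.sequence) != o.length or set(o.sequence) - set("acgt"):
            bad.append(o.role)
    return bad


def amplicon_span(oligos):
    """Bases covered from forward start to reverse end (assumed convention)."""
    fwd = next(o for o in oligos if o.role == "Forward")
    rev = next(o for o in oligos if o.role == "Reverse")
    return rev.start + rev.length - fwd.start
```

Under the stated assumption, the Ag4554 set would span 92 bases of the target; the figure should be read as an estimate only.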



Table GB. Probe Name Ag7230

Primers | Sequence | Length | Start Position | SEQ ID No
Forward | 5'-gcatatcgttgttggggact-3' | 20 | 852 | 273
Probe | TET-5'-atggatgtgtcctcagtccaccacaa-3'-TAMRA | 26 | 880 | 274
Reverse | 5'-cacggagtagcgaaggagtg-3' | 20 | 911 | 275



Table GC. CNS_neurodegeneration_v1.0

Tissue Name | Rel. Exp.(%) Ag4554, Run 224721290 | Rel. Exp.(%) Ag7230, Run 288742189 | Tissue Name | Rel. Exp.(%) Ag4554, Run 224721290 | Rel. Exp.(%) Ag7230, Run 288742189
AD 1 Hippo | 9.3 | 14.1 | Control (Path) 3 Temporal Ctx | 5.7 | 5.3
AD 2 Hippo | 22.2 | 20.2 | Control (Path) 4 Temporal Ctx | 20.0 | 19.2
AD 3 Hippo | 10.6 | 9.7 | AD 1 Occipital Ctx | 7.3 | 18.6
AD 4 Hippo | 7.1 | 5.3 | AD 2 Occipital Ctx (Missing) | 0.0 | 0.0
AD 5 Hippo | 100.0 | 100.0 | AD 3 Occipital Ctx | 11.3 | 8.0
AD 6 Hippo | 36.9 | 42.0 | AD 4 Occipital Ctx | 19.8 | 13.4
Control 2 Hippo | 22.7 | 23.8 | AD 5 Occipital Ctx | 15.9 | 18.0
Control 4 Hippo | 7.7 | 10.2 | AD 6 Occipital Ctx | 53.2 | 54.3
Control (Path) 3 Hippo | 6.9 | 5.2 | Control 1 Occipital Ctx | 4.5 | 3.9
AD 1 Temporal Ctx | 15.7 | 18.2 | Control 2 Occipital Ctx | 81.8 | 90.8
AD 2 Temporal Ctx | 20.2 | 20.0 | Control 3 Occipital Ctx | 14.4 | 14.7
AD 3 Temporal Ctx | 9.9 | 8.0 | Control 4 Occipital Ctx | 6.4 | 6.8
AD 4 Temporal Ctx | 18.8 | 9.8 | Control (Path) 1 Occipital Ctx | 45.4 | 57.8
AD 5 Inf Temporal Ctx | 97.9 | 81.2 | Control (Path) 2 Occipital Ctx | 6.1 | 6.1
AD 5 Sup Temporal Ctx | 31.6 | 36.3 | Control (Path) 3 Occipital Ctx | 5.1 | 5.2
AD 6 Inf Temporal Ctx | 26.2 | 28.9 | Control (Path) 4 Occipital Ctx | 12.6 | 12.8
AD 6 Sup Temporal Ctx | 29.1 | 33.7 | Control 1 Parietal Ctx | 6.4 | 5.7
Control 1 Temporal Ctx | 9.5 | 5.1 | Control 2 Parietal Ctx | 26.4 | 26.4
Control 2 Temporal Ctx | 39.0 | 43.2 | Control 3 Parietal Ctx | 18.0 | 19.6
Control 3 Temporal Ctx | 10.1 | 11.4 | Control (Path) 1 Parietal Ctx | 56.3 | 70.7
Control 4 Temporal Ctx | 6.6 | 6.7 | Control (Path) 2 Parietal Ctx | 15.7 | 15.2
Control (Path) 1 Temporal Ctx | 32.8 | 35.1 | Control (Path) 3 Parietal Ctx | 5.5 | 5.1
Control (Path) 2 Temporal Ctx | 20.4 | 22.8 | Control (Path) 4 Parietal Ctx | 41.5 | 36.3


Table GD. General_screening_panel_v1.4

Tissue Name | Rel. Exp.(%) Ag4554, Run 222809973 | Tissue Name | Rel. Exp.(%) Ag4554, Run 222809973
Adipose | 5.4 | Renal ca. TK-10 | 34.6
Melanoma* Hs688(A).T | 45.1 | Bladder | 15.8
Melanoma* Hs688(B).T | 45.1 | Gastric ca. (liver met) NCI-N87 | 21.3
Melanoma* M14 | 85.9 | Gastric ca. KATO III | 84.1
Melanoma* LOXIMVI | 21.9 | Colon ca. SW-948 | 0.7
Melanoma* SK-MEL-5 | 69.7 | Colon ca. SW480 | 52.5
Squamous cell carcinoma SCC-4 | 26.8 | Colon ca.* (SW480 met) SW620 | 27.0
Testis Pool | 6.8 | Colon ca. HT29 | 12.5
Prostate ca.* (bone met) PC-3 | 29.9 | Colon ca. HCT-116 | (illegible)
Prostate Pool | 6.9 | Colon ca. CaCo-2 | 25.5
Placenta | 5.7 | Colon cancer tissue | 24.1
Uterus Pool | 4.8 | Colon ca. SW1116 | 8.5
Ovarian ca. OVCAR-3 | 14.9 | Colon ca. Colo-205 | 12.9
Ovarian ca. SK-OV-3 | 100.0 | Colon ca. SW-48 | 6.5
Ovarian ca. OVCAR-4 | 10.2 | Colon Pool | 15.0
Ovarian ca. OVCAR-5 | 36.1 | Small Intestine Pool | 17.8
Ovarian ca. IGROV-1 | 20.3 | Stomach Pool | 9.0
Ovarian ca. OVCAR-8 | 16.0 | Bone Marrow Pool | 5.0
Ovary | 15.0 | Fetal Heart | 23.5
Breast ca. MCF-7 | 16.5 | Heart Pool | 12.2
Breast ca. MDA-MB-231 | 51.1 | Lymph Node Pool | 15.1
Breast ca. BT 549 | 47.3 | Fetal Skeletal Muscle | 4.6
Breast ca. T47D | 62.0 | Skeletal Muscle Pool | 12.0
Breast ca. MDA-N | 17.8 | Spleen Pool | 10.7
Breast Pool | 12.5 | Thymus Pool | 26.2
Trachea | 12.3 | CNS cancer (glio/astro) U87-MG | 65.1
Lung | 1.2 | CNS cancer (glio/astro) U-118-MG | 79.0
Fetal Lung | 27.4 | CNS cancer (neuro;met) SK-N-AS | 48.6
Lung ca. NCI-N417 | 8.0 | CNS cancer (astro) SF-539 | 23.3
Lung ca. LX-1 | 52.1 | CNS cancer (astro) SNB-75 | 89.5
Lung ca. NCI-H146 | 22.5 | CNS cancer (glio) SNB-19 | 21.8
Lung ca. SHP-77 | 97.9 | CNS cancer (glio) SF-295 | 63.7
Lung ca. A549 | 25.0 | Brain (Amygdala) Pool | 14.8
Lung ca. NCI-H526 | 0.0 | Brain (cerebellum) | 90.8
Lung ca. NCI-H23 | 45.1 | Brain (fetal) | 30.4
Lung ca. NCI-H460 | 15.9 | Brain (Hippocampus) Pool | 15.0
Lung ca. HOP-62 | 27.4 | Cerebral Cortex Pool | 29.3
Lung ca. NCI-H522 | 27.9 | Brain (Substantia nigra) Pool | 31.2
Liver | 3.7 | Brain (Thalamus) Pool | 27.7
Fetal Liver | 12.0 | Brain (whole) | 29.3
Liver ca. HepG2 | 28.1 | Spinal Cord Pool | 11.8
Kidney Pool | 25.0 | Adrenal Gland | 29.1
Fetal Kidney | 13.7 | Pituitary gland Pool | 24.8
Renal ca. 786-0 | 24.0 | Salivary Gland | 11.6
Renal ca. A498 | 4.5 | Thyroid (female) | 11.5
Renal ca. ACHN | 6.3 | Pancreatic ca. CAPAN2 | 10.4
Renal ca. UO-31 | 18.8 | Pancreas Pool | 21.8

Table GE. Panel 4.1D

Tissue Name | Rel. Exp.(%) Ag4554, Run 199319739 | Rel. Exp.(%) Ag7230, Run 288211134 | Tissue Name | Rel. Exp.(%) Ag4554, Run 199319739 | Rel. Exp.(%) Ag7230, Run 288211134
Secondary Th1 act | 70.2 | 48.3 | HUVEC IL-1beta | 62.9 | 38.4
Secondary Th2 act | 44.8 | 30.4 | HUVEC IFN gamma | 50.3 | 35.1
Secondary Tr1 act | 64.2 | 17.8 | HUVEC TNF alpha + IFN gamma | (illegible) | (illegible)
Secondary Th1 rest | 17.7 | 6.7 | HUVEC TNF alpha + IL4 | 43.2 | 13.1
Secondary Th2 rest | 22.4 | 6.6 | HUVEC IL-11 | 38.2 | 16.7
Secondary Tr1 rest | (illegible) | (illegible) | Lung Microvascular EC none | 100.0 | 100.0
Primary Th1 act | 27.7 | 6.0 | Lung Microvascular EC TNFalpha + IL-1beta | 82.4 | 42.0
Primary Th2 act | 42.3 | 24.8 | Microvascular Dermal EC none | 40.3 | 9.7
Primary Tr1 act | 39.5 | 31.4 | Microvascular Dermal EC TNFalpha + IL-1beta | 28.3 | 7.1
Primary Th1 rest | 17.2 | 12.2 | Bronchial epithelium TNFalpha + IL-1beta | 17.7 | (illegible)
Primary Th2 rest | 11.0 | 10.1 | Small airway epithelium none | (illegible) | (illegible)
Primary Tr1 rest | 39.2 | 1.2 | Small airway epithelium TNFalpha + IL-1beta | 11.4 | 6.6
CD45RA CD4 lymphocyte act | 39.8 | 18.7 | Coronary artery SMC rest | 24.8 | 14.1
CD45RO CD4 lymphocyte act | 44.4 | 31.4 | Coronary artery SMC TNFalpha + IL-1beta | 24.7 | 17.0
CD8 lymphocyte act | 41.2 | 10.8 | Astrocytes rest | 11.7 | 10.2
Secondary CD8 lymphocyte rest | 43.5 | 9.9 | Astrocytes TNFalpha + IL-1beta | 7.8 | (illegible)
Secondary CD8 lymphocyte act | 11.2 | 4.4 | KU-812 (Basophil) rest | 5.8 | 4.3
CD4 lymphocyte none | 19.2 | 5.0 | KU-812 (Basophil) PMA/ionomycin | 7.9 | 5.7
2ry Th1/Th2/Tr1_anti-CD95 CH11 | 40.9 | 11.2 | CCD1106 (Keratinocytes) none | 14.6 | 13.4
LAK cells rest | 21.0 | 8.0 | CCD1106 (Keratinocytes) TNFalpha + IL-1beta | 5.7 | 2.1
LAK cells IL-2 | 23.0 | 13.0 | Liver cirrhosis | 3.0 | 4.0
LAK cells IL-2+IL-12 | 12.7 | 1.5 | NCI-H292 none | 3.4 | 7.5
LAK cells IL-2+IFN gamma | 14.6 | 5.6 | NCI-H292 IL-4 | 7.1 | 8.0
LAK cells IL-2+IL-18 | 18.7 | 7.7 | NCI-H292 IL-9 | 9.7 | 6.6
LAK cells PMA/ionomycin | 23.8 | 14.3 | NCI-H292 IL-13 | 10.7 | 6.3
NK Cells IL-2 rest | 42.9 | 35.8 | NCI-H292 IFN gamma | 3.2 | 1.5
Two Way MLR 3 day | 22.5 | 9.9 | HPAEC none | 31.0 | 13.9
Two Way MLR 5 day | 20.9 | 3.3 | HPAEC TNF alpha + IL-1 beta | 52.5 | 31.9
Two Way MLR 7 day | 21.2 | 10.2 | Lung fibroblast none | 16.0 | 7.7
PBMC rest | 12.0 | 6.8 | Lung fibroblast TNF alpha + IL-1 beta | 16.8 | 9.6
PBMC PWM | 19.3 | 5.1 | Lung fibroblast IL-4 | 16.3 | 7.6
PBMC PHA-L | 29.9 | 14.4 | Lung fibroblast IL-9 | 23.2 | 11.4
Ramos (B cell) none | 19.3 | 6.5 | Lung fibroblast IL-13 | 13.8 | 7.0
Ramos (B cell) ionomycin | 21.3 | 13.7 | Lung fibroblast IFN gamma | 7.1 | 6.1
B lymphocytes PWM | 18.2 | 9.9 | Dermal fibroblast CCD1070 rest | 22.7 | 36.6
B lymphocytes CD40L and IL-4 | 26.4 | 25.7 | Dermal fibroblast CCD1070 TNF alpha | 63.7 | 59.5
EOL-1 dbcAMP | 29.3 | 26.2 | Dermal fibroblast CCD1070 IL-1 beta | 29.9 | 19.3
EOL-1 dbcAMP PMA/ionomycin | (illegible) | 7.5 | Dermal fibroblast IFN gamma | (illegible) | 5.6
Dendritic cells none | 28.9 | 17.6 | Dermal fibroblast IL-4 | 20.6 | 12.9
Dendritic cells LPS | 9.0 | 2.8 | Dermal Fibroblasts rest | 15.2 | 20.7
Dendritic cells anti-CD40 | 40.6 | 8.3 | Neutrophils TNFa+LPS | 18.4 | 16.0
Monocytes rest | 20.7 | 7.6 | Neutrophils rest | 16.3 | 20.6
Monocytes LPS | 18.2 | 15.7 | Colon | 14.1 | 3.9
Macrophages rest | 20.0 | 8.2 | Lung | 9.9 | 2.6
Macrophages LPS | 4.0 | 2.0 | Thymus | 39.2 | 7.4
HUVEC none | 57.8 | 31.9 | Kidney | 18.8 | 11.6
HUVEC starved | 64.2 | 50.0

Table GF. Panel 5 Islet

Tissue Name | Rel. Exp.(%) Ag4554, Run 306350410 | Tissue Name | Rel. Exp.(%) Ag4554, Run 306350410
97457_Patient-02go_adipose | 5.0 | 94709_Donor 2 AM - A_adipose | 20.3
97476_Patient-07sk_skeletal muscle | 0.0 | 94710_Donor 2 AM - B_adipose | 12.8
97477_Patient-07ut_uterus | (illegible) | 94711_Donor 2 AM - C_adipose | (illegible)
97478_Patient-07pl_placenta | (illegible) | 94712_Donor 2 AD - A_adipose | (illegible)
99167_Bayer Patient 1 | 100.0 | 94713_Donor 2 AD - B_adipose | (illegible)
97482_Patient-08ut_uterus | 2.4 | 94714_Donor 2 AD - C_adipose | 17.3
97483_Patient-08pl_placenta | 1.9 | 94742_Donor 3 U - A_Mesenchymal Stem Cells | 10.0
97486_Patient-09sk_skeletal muscle | 3.4 | 94743_Donor 3 U - B_Mesenchymal Stem Cells | 9.7
97487_Patient-09ut_uterus | 3.4 | 94730_Donor 3 AM - A_adipose | (illegible)
97488_Patient-09pl_placenta | 0.9 | 94731_Donor 3 AM - B_adipose | 47.0
97492_Patient-10ut_uterus | 5.6 | 94732_Donor 3 AM - C_adipose | 33.9
97493_Patient-10pl_placenta | 6.0 | 94733_Donor 3 AD - A_adipose | 46.3
97495_Patient-11go_adipose | 4.7 | 94734_Donor 3 AD - B_adipose | 72.7
97496_Patient-11sk_skeletal muscle | 3.4 | 94735_Donor 3 AD - C_adipose | 13.7
97497_Patient-11ut_uterus | 6.0 | 77138_Liver_HepG2untreated | 41.5
97498_Patient-11pl_placenta | (illegible) | 73556_Heart_Cardiac stromal cells (primary) | 8.5
97500_Patient-12go_adipose | 8.7 | 81735_Small Intestine | 18.0
97501_Patient-12sk_skeletal muscle | 14.2 | 72409_Kidney_Proximal Convoluted Tubule | 9.3
97502_Patient-12ut_uterus | 12.3 | 82685_Small intestine_Duodenum | 20.2
97503_Patient-12pl_placenta | 3.5 | 90650_Adrenal_Adrenocortical adenoma | 10.1
94721_Donor 2 U - A_Mesenchymal Stem Cells | 21.6 | 72410_Kidney_HRCE | 16.8
94722_Donor 2 U - B_Mesenchymal Stem Cells | 6.3 | 72411_Kidney_HRE | 6.8
94723_Donor 2 U - C_Mesenchymal Stem Cells | 20.2 | 73139_Uterus_Uterine smooth muscle cells | 19.5

CNS_neurodegeneration_v1.0 Summary: Ag4554/Ag7230 Two experiments
with different probe-primer sets are in excellent agreement. This panel confirms the
expression of this gene at low levels in the brains of an independent group of individuals.
However, no differential expression of this gene was detected between Alzheimer's
diseased postmortem brains and those of non-demented controls in this experiment. Please
see Panel 1.4 for a discussion of this gene in treatment of central nervous system disorders.
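
One way to put a number on the agreement between the two probe-primer sets is a Pearson correlation across matched samples. The sketch below uses the eight hippocampus rows of Table GC as input; the choice of Pearson's r and of this subset is an illustrative assumption and is not the analysis described in this document.

```python
import math

# Rel. Exp.(%) for the hippocampus samples in Table GC: (Ag4554, Ag7230).
HIPPOCAMPUS = {
    "AD 1 Hippo": (9.3, 14.1),
    "AD 2 Hippo": (22.2, 20.2),
    "AD 3 Hippo": (10.6, 9.7),
    "AD 4 Hippo": (7.1, 5.3),
    "AD 5 Hippo": (100.0, 100.0),
    "AD 6 Hippo": (36.9, 42.0),
    "Control 2 Hippo": (22.7, 23.8),
    "Control 4 Hippo": (7.7, 10.2),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


ag4554 = [pair[0] for pair in HIPPOCAMPUS.values()]
ag7230 = [pair[1] for pair in HIPPOCAMPUS.values()]
r = pearson_r(ag4554, ag7230)
```

Because both runs are scaled so that the highest sample equals 100, a single high-expressing sample can dominate the coefficient; a rank-based statistic may be a more conservative choice.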

General_screening_panel_v1.4 Summary: Ag4554 Highest expression of this
gene is detected in an ovarian cancer cell line (CT=25.4). Moderate levels of expression of
this gene are also seen in a cluster of cancer cell lines derived from pancreatic, gastric, colon,
lung, liver, renal, breast, ovarian, prostate, squamous cell carcinoma, melanoma and brain
cancers. Thus, expression of this gene could be used as a marker to detect the presence of
these cancers. Furthermore, therapeutic modulation of the expression or function of this
gene may be effective in the treatment of pancreatic, gastric, colon, lung, liver, renal,
breast, ovarian, prostate, squamous cell carcinoma, melanoma and brain cancers.

Among tissues with metabolic or endocrine function, this gene is expressed at
moderate levels in pancreas, adipose, adrenal gland, thyroid, pituitary gland, skeletal
muscle, heart, liver and the gastrointestinal tract. Therefore, therapeutic modulation of the
activity of this gene may prove useful in the treatment of endocrine/metabolically related
diseases, such as obesity and diabetes.

In addition, this gene is expressed at high levels in all regions of the central nervous
system examined, including amygdala, hippocampus, substantia nigra, thalamus,
cerebellum, cerebral cortex, and spinal cord. Therefore, therapeutic modulation of this gene
product may be useful in the treatment of central nervous system disorders such as
Alzheimer's disease, Parkinson's disease, epilepsy, multiple sclerosis, schizophrenia and
depression.

Interestingly, this gene is expressed at much higher levels in fetal lung (CT=27.3) than
in adult lung (CT=31.8). This observation suggests that expression of this gene
can be used to distinguish fetal from adult lung. In addition, the relative overexpression of
this gene in fetal tissue suggests that the protein product may enhance lung growth or
development in the fetus and thus may also act in a regenerative capacity in the adult.
Therefore, therapeutic modulation of the protein encoded by this gene could be useful in
treatment of lung related diseases.
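
The CT values quoted above can be related to the Rel. Exp.(%) column with a comparative-CT calculation. The sketch below assumes that each PCR cycle doubles the product and that the sample with the lowest CT on the panel is set to 100%; both are assumptions made for illustration, and the function names are hypothetical.

```python
def relative_expression(ct: float, ct_reference: float) -> float:
    """Percent expression relative to the reference (lowest-CT) sample.

    Assumes perfect doubling per cycle, so a difference of one CT unit
    corresponds to a two-fold difference in template.
    """
    return 100.0 * 2.0 ** (ct_reference - ct)


def fold_difference(ct_a: float, ct_b: float) -> float:
    """How many times more template sample A has than sample B."""
    return 2.0 ** (ct_b - ct_a)


# CT values quoted in the summary for Ag4554 on General_screening_panel_v1.4.
CT_HIGHEST = 25.4      # ovarian cancer cell line, set to 100%
CT_FETAL_LUNG = 27.3
CT_ADULT_LUNG = 31.8

fetal_pct = relative_expression(CT_FETAL_LUNG, CT_HIGHEST)
adult_pct = relative_expression(CT_ADULT_LUNG, CT_HIGHEST)
fetal_vs_adult = fold_difference(CT_FETAL_LUNG, CT_ADULT_LUNG)
```

Under these assumptions the fetal and adult lung values come out near 27% and 1.2%, close to the 27.4 and 1.2 listed in Table GD, and the fetal-to-adult difference is roughly 23-fold. Small discrepancies are expected because the quoted CT values are rounded.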

Panel 4.1D Summary: Ag4554/Ag7230 Two experiments with different
probe-primer sets are in excellent agreement. Highest expression of this gene is detected in
lung microvascular endothelial cells (CTs=28-29). This gene is expressed at high to
moderate levels in a wide range of cell types of significance in the immune response in
health and disease. These cells include members of the T-cell, B-cell, endothelial cell,
macrophage/monocyte, and peripheral blood mononuclear cell family, as well as epithelial
and fibroblast cell types from lung and skin, and normal tissues represented by colon, lung,
thymus and kidney. This ubiquitous pattern of expression suggests that this gene product
may be involved in homeostatic processes for these and other cell types and tissues. This
pattern is in agreement with the expression profile in General_screening_panel_v1.4 and
also suggests a role for the gene product in cell survival and proliferation. Therefore,
modulation of the gene product with a functional therapeutic may lead to the alteration of
functions associated with these cell types and lead to improvement of the symptoms of
patients suffering from autoimmune and inflammatory diseases such as asthma, allergies,
inflammatory bowel disease, lupus erythematosus, psoriasis, rheumatoid arthritis, and
osteoarthritis.

Panel 5 Islet Summary: Ag4554 Highest expression of this gene is detected in
islet cells (CT=29.8). This gene shows a widespread expression pattern which correlates
with the pattern seen in panel 1.4. Please see panel 1.4 for further discussion of this gene.

H. CG143787-01: Disintegrin Protease. 

Expression of gene CG143787-01 was assessed using the primer-probe sets 
Ag6532, Ag6655 and Ag7048, described in Tables HA, HB and HC. Please note that 
CG143787-01 represents a full-length physical clone. 

Table HA. Probe Name Ag6532

Primers | Sequence | Length | Start Position | SEQ ID No
Forward | 5'-atcatcaccaaagataccttttatctc-3' | 27 | 474 | 276
Probe | TET-5'-agaaaccaaagtgcctgctgcaagc-3'-TAMRA | 25 | 501 | 277
Reverse | 5'-gtgttgtcattatatttgtaggaataggt-3' | 29 | 526 | 278


Table HB. Probe Name Ag6655

Primers | Sequence | Length | Start Position | SEQ ID No
Forward | 5'-atcatcaccaaagataccttttatctc-3' | 27 | 474 | 279
Probe | TET-5'-agaaaccaaagtgcctgctgcaagc-3'-TAMRA | 25 | 501 | 280
Reverse | 5'-gtgttgtcattatatttgtaggaataggt-3' | 29 | 526 | 281


Table HC. Probe Name Ag7048

Primers | Sequence | Length | Start Position | SEQ ID No
Forward | 5'-acatcatcaccaaagatacctttta-3' | 25 | 472 | 282
Probe | TET-5'-caaagtgcctgctgcaagcacctatt-3'-TAMRA | 26 | 507 | 283
Reverse | 5'-gttcccacacactggtgttg-3' | 20 | 549 | 284



General_screening_panel_v1.6 Summary: Ag6655/Ag7048 Expression of this
gene is low/undetectable (CTs > 35) across all of the samples on this panel.

Panel 4.1D Summary: Ag6655 Expression of this gene is low/undetectable (CTs
> 35) across all of the samples on this panel.

I. CG144112-01: NEUROPSIN PRECURSOR.

Expression of gene CG144112-01 was assessed using the primer-probe set Ag7123,
described in Table IA. Please note that CG56663-01 represents a full-length physical clone.

Table IA. Probe Name Ag7123

Primers | Sequence | Length | Start Position | SEQ ID No
Forward | 5'-gcctgggcaggaaatacac-3' | 19 | 353 | 285
Probe | TET-5'-tacgcctgggagaccacagcctacag-3'-TAMRA | 26 | 325 | 286
Reverse | 5'-tctcggggactgcacttct-3' | 19 | 292 | 287



CNS_neurodegeneration_v1.0 Summary: Ag7123 Expression of this gene is
low/undetectable (CTs > 35) across all of the samples on this panel.

Panel 4.1D Summary: Ag7123 Expression of this gene is low/undetectable (CTs
> 35) across all of the samples on this panel.

J. CG144112-04: Kallikrein-8.

Expression of gene CG144112-04 was assessed using the primer-probe set Ag5271,
described in Table JA.

Table JA. Probe Name Ag5271

Primers | Sequence | Length | Start Position | SEQ ID No
Forward | 5'-gcagggcagggcgattct-3' | 18 | 97 | 288
Probe | TET-5'-cacatcctggggctcagacccctgtg-3'-TAMRA | 26 | 153 | 289
Reverse | 5'-ctagaatcagcccttgctgccta-3' | 23 | 245 | 290



CNS_neurodegeneration_v1.0 Summary: Ag5271 Expression of this gene is
low/undetectable (CTs > 35) across all of the samples on this panel.

Panel 4.1D Summary: Ag5271 Expression of this gene is low/undetectable (CTs
> 35) across all of the samples on this panel.

K. CG144686-01: MAST CELL CARBOXYPEPTIDASE A PRECURSOR.

Expression of gene CG144686-01 was assessed using the primer-probe set Ag6864,
described in Table KA. Results of the RTQ-PCR runs are shown in Tables KB and KC.
Please note that CG144686-01 represents a full-length physical clone.

Table KA. Probe Name Ag6864

Primers | Sequence | Length | Start Position | SEQ ID No
Forward | 5'-aaccagtgagctccgaga-3' | 18 | 122 | 291
Probe | TET-5'-caaatttggttttctccttccagaatcc-3'-TAMRA | 28 | 146 | 292
Reverse | 5'-tctgcacgttggctttat-3' | 18 | 177 | 293

Table KB. General_screening_panel_v1.6

Tissue Name | Rel. Exp.(%) Ag6864, Run 278387547 | Tissue Name | Rel. Exp.(%) Ag6864, Run 278387547
Adipose | 15.0 | Renal ca. TK-10 | 0.0
Melanoma* Hs688(A).T | 0.3 | Bladder | 0.0
Melanoma* Hs688(B).T | 0.7 | Gastric ca. (liver met.) NCI-N87 | 0.0
Melanoma* M14 | 0.0 | Gastric ca. KATO III | 0.0
Melanoma* LOXIMVI | 0.0 | Colon ca. SW-948 | 0.0
Melanoma* SK-MEL-5 | 0.0 | Colon ca. SW480 | 0.0
Squamous cell carcinoma SCC-4 | 0.0 | Colon ca.* (SW480 met) SW620 | 0.0
Testis Pool | 7.6 | Colon ca. HT29 | 0.0
Prostate ca.* (bone met) PC-3 | 0.0 | Colon ca. HCT-116 | 0.0
Prostate Pool | 16.4 | Colon ca. CaCo-2 | 0.0
Placenta | 0.1 | Colon cancer tissue | 70.7
Uterus Pool | 15.8 | Colon ca. SW1116 | 0.0
Ovarian ca. OVCAR-3 | 0.0 | Colon ca. Colo-205 | 0.0
Ovarian ca. SK-OV-3 | 0.0 | Colon ca. SW-48 | 0.0
Ovarian ca. OVCAR-4 | 0.0 | Colon Pool | 78.5
Ovarian ca. OVCAR-5 | 0.0 | Small Intestine Pool | 0.0
Ovarian ca. IGROV-1 | 0.0 | Stomach Pool | 20.0
Ovarian ca. OVCAR-8 | 0.0 | Bone Marrow Pool | 23.2
Ovary | 2.5 | Fetal Heart | 4.6
Breast ca. MCF-7 | 0.0 | Heart Pool | 20.0
Breast ca. MDA-MB-231 | 0.0 | Lymph Node Pool | 100.0
Breast ca. BT 549 | 0.7 | Fetal Skeletal Muscle | 5.5
Breast ca. T47D | 0.0 | Skeletal Muscle Pool | 1.5
Breast ca. MDA-N | 0.0 | Spleen Pool | 3.0
Breast Pool | 0.0 | Thymus Pool | 18.2
Trachea | 2.5 | CNS cancer (glio/astro) U87-MG | 0.0
Lung | 2.7 | CNS cancer (glio/astro) U-118-MG | 1.8
Fetal Lung | 5.3 | CNS cancer (neuro;met) SK-N-AS | 0.0
Lung ca. NCI-N417 | 0.0 | CNS cancer (astro) SF-539 | 0.0
Lung ca. LX-1 | 0.0 | CNS cancer (astro) SNB-75 | 0.0
Lung ca. NCI-H146 | 0.0 | CNS cancer (glio) SNB-19 | 0.0
Lung ca. SHP-77 | 4.5 | CNS cancer (glio) SF-295 | 0.0
Lung ca. A549 | 0.0 | Brain (Amygdala) Pool | 0.0
Lung ca. NCI-H526 | 0.0 | Brain (cerebellum) | 0.0
Lung ca. NCI-H23 | 0.0 | Brain (fetal) | 0.0
Lung ca. NCI-H460 | 0.0 | Brain (Hippocampus) Pool | 0.0
Lung ca. HOP-62 | 0.9 | Cerebral Cortex Pool | 0.0
Lung ca. NCI-H522 | 0.0 | Brain (Substantia nigra) Pool | (illegible)
Liver | (illegible) | Brain (Thalamus) Pool | 6.0
Fetal Liver | 0.0 | Brain (whole) | 0.0
Liver ca. HepG2 | 0.0 | Spinal Cord Pool | 0.0
Kidney Pool | 51.4 | Adrenal Gland | 0.7
Fetal Kidney | 1.1 | Pituitary gland Pool | 1.0
Renal ca. 786-0 | 0.2 | Salivary Gland | 0.0
Renal ca. A498 | 0.0 | Thyroid (female) | 0.2
Renal ca. ACHN | 0.0 | Pancreatic ca. CAPAN2 | 0.0
Renal ca. UO-31 | 0.2 | Pancreas Pool | 10.4



Table KC. Panel 5 Islet

Tissue Name | Rel. Exp.(%) Ag6864, Run 30542485(illegible) | Rel. Exp.(%) Ag6864, Run 30765049(illegible) | Tissue Name | Rel. Exp.(%) Ag6864, Run 30542485(illegible) | Rel. Exp.(%) Ag6864, Run 30765049(illegible)
97457_Patient-02go_adipose | 5.5 | 34.9 | 94709_Donor 2 AM - A_adipose | 0.0 | 0.0
97476_Patient-07sk_skeletal muscle | 0.0 | 0.0 | 94710_Donor 2 AM - B_adipose | 0.0 | 0.0
97477_Patient-07ut_uterus | 1.4 | 32.1 | 94711_Donor 2 AM - C_adipose | 0.0 | 0.0
97478_Patient-07pl_placenta | 0.0 | 4.7 | 94712_Donor 2 AD - A_adipose | 0.0 | 0.0
99167_Bayer Patient 1 | 0.0 | 0.0 | 94713_Donor 2 AD - B_adipose | 0.0 | 0.0
97482_Patient-08ut_uterus | 0.0 | 0.0 | 94714_Donor 2 AD - C_adipose | 2.3 | 0.0
97483_Patient-08pl_placenta | 0.0 | 0.0 | 94742_Donor 3 U - A_Mesenchymal Stem Cells | 0.0 | 0.0
97486_Patient-09sk_skeletal muscle | 7.6 | 15.5 | 94743_Donor 3 U - B_Mesenchymal Stem Cells | 0.0 | 0.0
97487_Patient-09ut_uterus | 28.7 | 11.2 | 94730_Donor 3 AM - A_adipose | 0.0 | 0.0
97488_Patient-09pl_placenta | 1.4 | 0.0 | 94731_Donor 3 AM - B_adipose | 0.0 | 1.9
97492_Patient-10ut_uterus | 10.4 | 7.2 | 94732_Donor 3 AM - C_adipose | 0.0 | 0.0
97493_Patient-10pl_placenta | 0.0 | 5.9 | 94733_Donor 3 AD - A_adipose | 0.0 | 0.0
97495_Patient-11go_adipose | 20.0 | 5.0 | 94734_Donor 3 AD - B_adipose | 0.0 | 0.0
97496_Patient-11sk_skeletal muscle | 6.0 | 8.7 | 94735_Donor 3 AD - C_adipose | 0.0 | 0.0
97497_Patient-11ut_uterus | 45.1 | 65.1 | 77138_Liver_HepG2untreated | 0.0 | 0.0
97498_Patient-11pl_placenta | 0.0 | 0.0 | 73556_Heart_Cardiac stromal cells (primary) | 5.1 | 3.2
97500_Patient-12go_adipose | 59.9 | 59.9 | 81735_Small Intestine | 73.2 | 65.1
97501_Patient-12sk_skeletal muscle | 100.0 | 100.0 | 72409_Kidney_Proximal Convoluted Tubule | (illegible) | 0.0
97502_Patient-12ut_uterus | 29.1 | 97.3 | 82685_Small intestine_Duodenum | 59.0 | 67.4
97503_Patient-12pl_placenta | 5.0 | 2.3 | 90650_Adrenal_Adrenocortical adenoma | 0.0 | 0.0
94721_Donor 2 U - A_Mesenchymal Stem Cells | 0.0 | 0.0 | 72410_Kidney_HRCE | 0.0 | 0.0
94722_Donor 2 U - B_Mesenchymal Stem Cells | 0.0 | 0.0 | 72411_Kidney_HRE | 0.0 | 0.0
94723_Donor 2 U - C_Mesenchymal Stem Cells | 1.5 | 0.0 | 73139_Uterus_Uterine smooth muscle cells | 0.0 | 0.0



General_screening_panel_v1.6 Summary: Ag6864 Highest expression of this
gene is seen in lymph node (CT=29). Moderate levels of expression are also seen
predominantly in normal tissue, including adipose, colon, heart, thymus, prostate, and
kidney, as well as in colon cancer tissue. Thus, expression of this gene could be used to
identify these samples and tissues. Modulation of the expression of this gene may also be
effective in the treatment of diseases of these tissues, including cancer, obesity and
diabetes.

Panel 5 Islet Summary: Ag6864 Two experiments with the same probe and 
primer produce results that are in excellent agreement. Highest expression of this gene is 
seen in skeletal muscle (CTs=33.5). Please see Panel 1.6 for discussion of this gene. 

L. CG144906-01: TESTISIN PRECURSOR. 

Expression of gene CG144906-01 was assessed using the primer-probe set Ag6915, 
described in Table LA. Please note that CG144906-01 represents a full-length physical 
clone. 






Table LA. Probe Name Ag6915

Primers | Sequence | Length | Start Position | SEQ ID No
Forward | 5'-catgccatcctccacattt-3' | 19 | 337 | 294
Probe | TET-5'-cagcagtctgtccggttctcaaactc-3'-TAMRA | 26 | 356 | 295
Reverse | 5'-gtgcctcatcctctttgatgta-3' | 22 | 398 | 296



General_screening_panel_v1.6 Summary: Ag6915 Expression of this gene is
low/undetectable (CTs > 35) across all of the samples on this panel.

M. CG144997-01: RNase H I. 

Expression of gene CG144997-01 was assessed using the primer-probe set Ag7057,
described in Table MA. Results of the RTQ-PCR runs are shown in Table MB. Please note
that CG144997-01 represents a full-length physical clone.

Table MA. Probe Name Ag7057

Primers | Sequence | Length | Start Position | SEQ ID No
Forward | 5'-gtaaacgccgattcctgct-3' | 19 | 468 | 297
Probe | TET-5'-cttctacgcccattactggagcagca-3'-TAMRA | 26 | 493 | 298
Reverse | 5'-gaatgagtgcagagacacgttt-3' | 22 | 558 | 299



Table MB. General_screening_panel_v1.6

Tissue Name | Rel. Exp.(%) Ag7057, Run 282273884 | Tissue Name | Rel. Exp.(%) Ag7057, Run 282273884
Adipose | 3.9 | Renal ca. TK-10 | 33.9
Melanoma* Hs688(A).T | 23.8 | Bladder | 15.7
Melanoma* Hs688(B).T | 28.3 | Gastric ca. (liver met.) NCI-N87 | 49.0
Melanoma* M14 | 50.7 | Gastric ca. KATO III | 100.0
Melanoma* LOXIMVI | 57.8 | Colon ca. SW-948 | 11.4
Melanoma* SK-MEL-5 | 51.4 | Colon ca. SW480 | 76.3
Squamous cell carcinoma SCC-4 | 22.5 | Colon ca.* (SW480 met) SW620 | 34.9
Testis Pool | 9.0 | Colon ca. HT29 | (illegible)
Prostate ca.* (bone met) PC-3 | 60.3 | Colon ca. HCT-116 | 36.6
Prostate Pool | 5.4 | Colon ca. CaCo-2 | 42.0
Placenta | 4.5 | Colon cancer tissue | 17.6
Uterus Pool | 1.9 | Colon ca. SW1116 | 5.4
Ovarian ca. OVCAR-3 | 31.2 | Colon ca. Colo-205 | 10.4
Ovarian ca. SK-OV-3 | 31.4 | Colon ca. SW-48 | 6.8
Ovarian ca. OVCAR-4 | 17.1 | Colon Pool | 9.5
Ovarian ca. OVCAR-5 | 39.0 | Small Intestine Pool | 5.7
Ovarian ca. IGROV-1 | 13.3 | Stomach Pool | 5.1
Ovarian ca. OVCAR-8 | 15.0 | Bone Marrow Pool | 3.3
Ovary | 4.9 | Fetal Heart | 4.7
Breast ca. MCF-7 | 21.8 | Heart Pool | 4.5
Breast ca. MDA-MB-231 | 17.3 | Lymph Node Pool | 8.9
Breast ca. BT 549 | 24.8 | Fetal Skeletal Muscle | 4.0
Breast ca. T47D | 9.5 | Skeletal Muscle Pool | 2.3
Breast ca. MDA-N | 22.7 | Spleen Pool | 4.1
Breast Pool | 12.3 | Thymus Pool | 8.2
Trachea | 7.3 | CNS cancer (glio/astro) U87-MG | 55.5
Lung | 1.9 | CNS cancer (glio/astro) U-118-MG | 49.7
Fetal Lung | 8.6 | CNS cancer (neuro;met) SK-N-AS | 49.7
Lung ca. NCI-N417 | 10.1 | CNS cancer (astro) SF-539 | 22.1
Lung ca. LX-1 | 22.4 | CNS cancer (astro) SNB-75 | 45.1
Lung ca. NCI-H146 | 11.9 | CNS cancer (glio) SNB-19 | 16.7
Lung ca. SHP-77 | 82.9 | CNS cancer (glio) SF-295 | 56.6
Lung ca. A549 | 54.0 | Brain (Amygdala) Pool | 7.3
Lung ca. NCI-H526 | 8.9 | Brain (cerebellum) | 20.0
Lung ca. NCI-H23 | 37.9 | Brain (fetal) | 8.0
Lung ca. NCI-H460 | 37.1 | Brain (Hippocampus) Pool | 8.1
Lung ca. HOP-62 | 12.1 | Cerebral Cortex Pool | 12.0
Lung ca. NCI-H522 | 56.6 | Brain (Substantia nigra) Pool | 6.7
Liver | 0.8 | Brain (Thalamus) Pool | 12.1
Fetal Liver | 6.7 | Brain (whole) | 7.1
Liver ca. HepG2 | 18.6 | Spinal Cord Pool | 6.7
Kidney Pool | 10.8 | Adrenal Gland | 6.9
Fetal Kidney | 5.8 | Pituitary gland Pool | 2.9
Renal ca. 786-0 | 21.6 | Salivary Gland | 2.6
Renal ca. A498 | 17.1 | Thyroid (female) | 2.5
Renal ca. ACHN | 17.6 | Pancreatic ca. CAPAN2 | 23.3
Renal ca. UO-31 | 18.0 | Pancreas Pool | 6.0





General_screening_panel_v1.6 Summary: Ag7057 Highest expression of this
gene is detected in a gastric cancer cell line (CT=27). Moderate levels of expression of this
gene are also seen in a cluster of cancer cell lines derived from pancreatic, gastric, colon, lung,
liver, renal, breast, ovarian, prostate, squamous cell carcinoma, melanoma and brain
cancers. Thus, expression of this gene could be used as a marker to detect the presence of
these cancers. Furthermore, therapeutic modulation of the expression or function of this
gene may be effective in the treatment of pancreatic, gastric, colon, lung, liver, renal,
breast, ovarian, prostate, squamous cell carcinoma, melanoma and brain cancers.

Among tissues with metabolic or endocrine function, this gene is expressed at
moderate levels in pancreas, adipose, adrenal gland, thyroid, pituitary gland, skeletal
muscle, heart, liver and the gastrointestinal tract. Therefore, therapeutic modulation of the
activity of this gene may prove useful in the treatment of endocrine/metabolically related
diseases, such as obesity and diabetes.

In addition, this gene is expressed at moderate levels in all regions of the central
nervous system examined, including amygdala, hippocampus, substantia nigra, thalamus,
cerebellum, cerebral cortex, and spinal cord. Therefore, therapeutic modulation of this gene
product may be useful in the treatment of central nervous system disorders such as
Alzheimer's disease, Parkinson's disease, epilepsy, multiple sclerosis, schizophrenia and
depression.

N. CG145494-01: PRESTIN. 

Expression of gene CG145494-01 was assessed using the primer-probe sets
Ag6694, Ag7803 and Ag7797, described in Tables NA, NB and NC. Results of the
RTQ-PCR runs are shown in Table ND.

Table NA. Probe Name Ag6694

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-ggcacagaggccagagat-3' | 18 | 559 | 300
Probe | TET-5'-gtgaccttactttcaggaatcattcagttttgc-3'-TAMRA | 33 | 604 | 301
Reverse | 5'-ggctctgtgagatatatggcc-3' | 21 | 663 | 302
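
As a consistency check on the primer and probe entries of Table NA, the following minimal Python sketch compares each sequence with its listed length and reports its G/C fraction. The dictionary name `TABLE_NA` and the helper functions are illustrative only and are not part of the original disclosure; the sequences and lengths are transcribed from Table NA.

```python
# Rows transcribed from Table NA (Ag6694): name -> (sequence, listed length).
# TABLE_NA, gc_fraction and check_lengths are hypothetical names used for illustration.
TABLE_NA = {
    "Forward": ("ggcacagaggccagagat", 18),
    "Probe": ("gtgaccttactttcaggaatcattcagttttgc", 33),
    "Reverse": ("ggctctgtgagatatatggcc", 21),
}


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in an oligonucleotide."""
    seq = seq.lower()
    return (seq.count("g") + seq.count("c")) / len(seq)


def check_lengths(table: dict) -> list:
    """Return the names of rows whose sequence length differs from the listed length."""
    return [name for name, (seq, listed) in table.items() if len(seq) != listed]


# An empty list means every sequence agrees with its Length column.
mismatches = check_lengths(TABLE_NA)

if __name__ == "__main__":
    print("length mismatches:", mismatches)
    for name, (seq, _) in TABLE_NA.items():
        print(name, len(seq), round(gc_fraction(seq), 2))
```

The same check can be applied to the other primer-probe tables by substituting their rows.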






Table NB. Probe Name Ag7803

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-ggagaaccagcaaaatagagct-3' | 22 | 1367 | 303
Probe | TET-5'-ccaatcccaggaacaaggaggacacaa-3'-TAMRA | 27 | 1409 | 304
Reverse | 5'-atcacagcagtgatcaaacca-3' | 21 | 1440 | 305

Table NC. Probe Name Ag7797

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-ccatctggcttaccacttttg-3' | 21 | 1391 | 306
Probe | TET-5'-cacagcagtgatcaaaccatagtccaatcc-3'-TAMRA | 30 | 1429 | 307
Reverse | 5'-aaatcacagtcagcagagcaat-3' | 22 | 1462 | 308

Table ND. General screening panel v1.6

Tissue Name | Rel. Exp.(%) Ag6694, Run 277223811 | Tissue Name | Rel. Exp.(%) Ag6694, Run 277223811
Adipose | 0.0 | Renal ca. TK-10 | 0.0
Melanoma* Hs688(A).T | 0.0 | Bladder | 0.0
Melanoma* Hs688(B).T | 0.0 | Gastric ca. (liver met) NCI-N87 | 0.0
Melanoma* M14 | 6.0 | Gastric ca. KATO III | 0.0
Melanoma* LOXIMVI | 0.0 | Colon ca. SW-948 | 0.0
Melanoma* SK-MEL-5 | 0.0 | Colon ca. SW480 | 0.0
Squamous cell carcinoma SCC-4 | 0.0 | Colon ca.* (SW480 met) SW620 | 0.0
Testis Pool | 0.0 | Colon ca. HT29 | 0.0
Prostate ca.* (bone met) PC-3 | 100.0 | Colon ca. HCT-116 | 0.0
Prostate Pool | 0.9 | Colon ca. CaCo-2 | 0.0
Placenta | 0.0 | Colon cancer tissue | 0.0
Uterus Pool | 0.0 | Colon ca. SW1116 | 0.0
Ovarian ca. OVCAR-3 | 0.0 | Colon ca. Colo-205 | 0.0
Ovarian ca. SK-OV-3 | 0.0 | Colon ca. SW-48 | 0.0
Ovarian ca. OVCAR-4 | 0.0 | Colon Pool | 0.0
Ovarian ca. OVCAR-5 | 0.0 | Small Intestine Pool | 0.0
Ovarian ca. IGROV-1 | 0.0 | Stomach Pool |
Ovarian ca. OVCAR-8 | 0.0 | Bone Marrow Pool | 0.0
Ovary | 0.0 | Fetal Heart | 0.0
Breast ca. MCF-7 | 6.6 | Heart Pool | 0.0
Breast ca. MDA-MB-231 | 0.0 | Lymph Node Pool | 0.0
Breast ca. BT 549 | 0.0 | Fetal Skeletal Muscle | 0.0
Breast ca. T47D | 0.0 | Skeletal Muscle Pool | 0.0
Breast ca. MDA-N | 0.0 | Spleen Pool | 0.0
Breast Pool | 0.0 | Thymus Pool | 0.0
Trachea | 1.0 | CNS cancer (glio/astro) U87-MG | 0.0
Lung | 0.0 | CNS cancer (glio/astro) U-118-MG | 0.0
Fetal Lung | 2.9 | CNS cancer (neuro;met) SK-N-AS | 0.0
Lung ca. NCI-N417 | 0.0 | CNS cancer (astro) SF-539 | 0.0
Lung ca. LX-1 | 0.0 | CNS cancer (astro) SNB-75 | 0.0
Lung ca. NCI-H146 | 0.0 | CNS cancer (glio) SNB-19 | 0.0
Lung ca. SHP-77 | 0.0 | CNS cancer (glio) SF-295 | 0.0
Lung ca. A549 | 0.0 | Brain (Amygdala) Pool | 0.0
Lung ca. NCI-H526 | 0.0 | Brain (cerebellum) | 14.6
Lung ca. NCI-H23 | 0.0 | Brain (fetal) | 0.0
Lung ca. NCI-H460 | 0.0 | Brain (Hippocampus) Pool | 0.0
Lung ca. HOP-62 | 0.0 | Cerebral Cortex Pool | 0.0
Lung ca. NCI-H522 | 0.0 | Brain (Substantia nigra) Pool | 0.0
Liver | 0.0 | Brain (Thalamus) Pool | 0.0
Fetal Liver | 0.0 | Brain (whole) | 0.0
Liver ca. HepG2 | 0.0 | Spinal Cord Pool | 0.0
Kidney Pool | 0.0 | Adrenal Gland | 0.0
Fetal Kidney | 0.0 | Pituitary gland Pool | 0.0
Renal ca. 786-0 | 0.0 | Salivary Gland | 0.0
Renal ca. A498 | 0.0 | Thyroid (female) | 0.0
Renal ca. ACHN | 0.0 | Pancreatic ca. CAPAN2 | 0.0
Renal ca. UO-31 | 0.0 | Pancreas Pool | 0.0



CNS_neurodegeneration_v1.0 Summary: Ag7797 Expression of this gene is
low/undetectable (CTs > 34.7) across all of the samples on this panel.

General_screening_panel_v1.6 Summary: Ag6694 A moderate level of expression
of this gene is restricted to a prostate cancer cell line (CT=32.6). Therefore, expression of
this gene may be used to distinguish this sample from other samples in this panel and also
as a diagnostic marker to detect the presence of prostate cancer. In addition, therapeutic
modulation of this gene may be useful in the treatment of prostate cancer.

Panel 4.1D Summary: Ag7803 Expression of this gene is low/undetectable (CTs
> 35) across all of the samples on this panel.

O. CG145722-01: WEE1-like protein kinase.

Expression of gene CG145722-01 was assessed using the primer-probe set Ag6231,
described in Table OA. Results of the RTQ-PCR runs are shown in Table OB.

Table OA. Probe Name Ag6231

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-gcttcctggctaatgagatttt-3' | 22 | 1339 | 309
Probe | TET-5'-agaggattaccggcaccttcccaaag-3'-TAMRA | 26 | 1364 | 310
Reverse | 5'-tgttaatcccaaggcaaatatg-3' | 22 | 1394 | 311

Table OB. General screening panel v1.5

Tissue Name | Rel. Exp.(%) Ag6231, Run 259211049 | Tissue Name | Rel. Exp.(%) Ag6231, Run 259211049
Adipose | 0.0 | Renal ca. TK-10 | 0.0
Melanoma* Hs688(A).T | 0.0 | Bladder | 0.0
Melanoma* Hs688(B).T | 0.0 | Gastric ca. (liver met.) NCI-N87 | 0.0
Melanoma* M14 | 0.0 | Gastric ca. KATO III | 0.0
Melanoma* LOXIMVI | 0.0 | Colon ca. SW-948 | 0.0
Melanoma* SK-MEL-5 | 0.0 | Colon ca. SW480 | 0.0
Squamous cell carcinoma SCC-4 | 0.0 | Colon ca.* (SW480 met) SW620 | 0.0
Testis Pool | 0.0 | Colon ca. HT29 | 0.0
Prostate ca.* (bone met) PC-3 | 0.0 | Colon ca. HCT-116 | 0.0
Prostate Pool | 0.0 | Colon ca. CaCo-2 | 97.3
Placenta | 0.0 | Colon cancer tissue | 0.0
Uterus Pool | 0.0 | Colon ca. SW1116 | 0.0
Ovarian ca. OVCAR-3 | 0.0 | Colon ca. Colo-205 | 0.0
Ovarian ca. SK-OV-3 | 0.0 | Colon ca. SW-48 | 0.0
Ovarian ca. OVCAR-4 | 0.0 | Colon Pool | 0.0
Ovarian ca. OVCAR-5 | 0.0 | Small Intestine Pool | 0.0
Ovarian ca. IGROV-1 | 0.0 | Stomach Pool | 0.0
Ovarian ca. OVCAR-8 | | Bone Marrow Pool | 0.0
Ovary | 0.0 | Fetal Heart |
Breast ca. MCF-7 | 0.0 | Heart Pool | 0.0
Breast ca. MDA-MB-231 | 0.0 | Lymph Node Pool | 0.0
Breast ca. BT 549 | 0.0 | Fetal Skeletal Muscle | 0.0
Breast ca. T47D | 0.0 | Skeletal Muscle Pool | 0.0
Breast ca. MDA-N | 0.0 | Spleen Pool | 0.0
Breast Pool | 0.0 | Thymus Pool | 0.0
Trachea | 0.0 | CNS cancer (glio/astro) U87-MG | 0.0
Lung | 0.0 | CNS cancer (glio/astro) U-118-MG | 0.0
Fetal Lung | 0.0 | CNS cancer (neuro;met) SK-N-AS | 0.0
Lung ca. NCI-N417 | 0.0 | CNS cancer (astro) SF-539 | 0.0
Lung ca. LX-1 | 0.0 | CNS cancer (astro) SNB-75 | 4.2
Lung ca. NCI-H146 | 100.0 | CNS cancer (glio) SNB-19 | 0.0
Lung ca. SHP-77 | 2.3 | CNS cancer (glio) SF-295 | 0.0
Lung ca. A549 | 0.0 | Brain (Amygdala) Pool | 2.3
Lung ca. NCI-H526 | 0.0 | Brain (cerebellum) | 5.6
Lung ca. NCI-H23 | 0.0 | Brain (fetal) | 2.6
Lung ca. NCI-H460 | 0.0 | Brain (Hippocampus) Pool | 0.0
Lung ca. HOP-62 | 0.0 | Cerebral Cortex Pool | 0.0
Lung ca. NCI-H522 | 0.0 | Brain (Substantia nigra) Pool | 0.0
Liver | 0.0 | Brain (Thalamus) Pool | 0.0
Fetal Liver | 0.0 | Brain (whole) | 3.7
Liver ca. HepG2 | 0.0 | Spinal Cord Pool | 0.0
Kidney Pool | 1.8 | Adrenal Gland | 0.0
Fetal Kidney | 2.2 | Pituitary gland Pool | 0.0
Renal ca. 786-0 | 0.0 | Salivary Gland | 0.0
Renal ca. A498 | 0.0 | Thyroid (female) | 0.0
Renal ca. ACHN | 0.0 | Pancreatic ca. CAPAN2 | 4.6
Renal ca. UO-31 | 6.0 | Pancreas Pool | 0.0



CNS_neurodegeneration_v1.0 Summary: Ag6231 Expression of this gene is
low/undetectable (CTs > 35) across all of the samples on this panel.

General_screening_panel_v1.5 Summary: Ag6231 Low levels of expression of
this gene are restricted to a lung cancer cell line and a colon cancer cell line (CTs=32.2). Therefore,
expression of this gene may be used to distinguish these cell lines from other samples in
this panel and also as a diagnostic marker to detect the presence of colon and lung cancers. In
addition, therapeutic modulation of this gene may be useful in the treatment of these
cancers.

Panel 4.1D Summary: Ag6231 Expression of this gene is low/undetectable (CTs
> 35) across all of the samples on this panel.

P. CG145754-02: KALLIKREIN 7.

Expression of gene CG145754-02 was assessed using the primer-probe set Ag7038,
described in Table PA. Results of the RTQ-PCR runs are shown in Tables PB and PC.
Please note that CG145754-02 represents a full-length physical clone.
Table PA. Probe Name Ag7038

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-tgttaatgacctcaagctcatctc-3' | 24 | 342 | 312
Probe | TET-5'-ccccaggactgcacgaaggtttacaa-3'-TAMRA | 26 | 367 | 313
Reverse | 5'-tttcttggagtcggggatg-3' | 19 | 426 | 314



Table PB. General screening panel v1.6

Tissue Name | Rel. Exp.(%) Ag7038, Run 282273672 | Tissue Name | Rel. Exp.(%) Ag7038, Run 282273672
Adipose | 1.6 | Renal ca. TK-10 | 0.0
Melanoma* Hs688(A).T | 0.0 | Bladder | 0.0
Melanoma* Hs688(B).T | 0.0 | Gastric ca. (liver met.) NCI-N87 | 100.0
Melanoma* M14 | 0.0 | Gastric ca. KATO III | 22.1
Melanoma* LOXIMVI | 0.0 | Colon ca. SW-948 | 4.4
Melanoma* SK-MEL-5 | 0.0 | Colon ca. SW480 | 10.5
Squamous cell carcinoma SCC-4 | 3.0 | Colon ca.* (SW480 met) SW620 | 0.0
Testis Pool | 0.0 | Colon ca. HT29 | 0.0
Prostate ca.* (bone met) PC-3 | 0.0 | Colon ca. HCT-116 | 9.7
Prostate Pool | 0.0 | Colon ca. CaCo-2 | 0.0
Placenta | 0.0 | Colon cancer tissue | 0.6
Uterus Pool | 0.0 | Colon ca. SW1116 | 38.7
Ovarian ca. OVCAR-3 | 4.1 | Colon ca. Colo-205 | 0.0
Ovarian ca. SK-OV-3 | 0.0 | Colon ca. SW-48 | 0.0
Ovarian ca. OVCAR-4 | 3.1 | Colon Pool | 0.0
Ovarian ca. OVCAR-5 | 0.0 | Small Intestine Pool | 0.0
Ovarian ca. IGROV-1 | 0.0 | Stomach Pool | 0.0
Ovarian ca. OVCAR-8 | 0.0 | Bone Marrow Pool | 0.0
Ovary | 0.0 | Fetal Heart | 0.0
Breast ca. MCF-7 | 0.0 | Heart Pool | 0.0
Breast ca. MDA-MB-231 | 0.0 | Lymph Node Pool |
Breast ca. BT 549 | 0.0 | Fetal Skeletal Muscle | 0.0
Breast ca. T47D | 0.0 | Skeletal Muscle Pool | 0.0
Breast ca. MDA-N | 0.0 | Spleen Pool | 6.0
Breast Pool | 0.0 | Thymus Pool | 0.0
Trachea | 0.0 | CNS cancer (glio/astro) U87-MG | 0.0
Lung | 0.0 | CNS cancer (glio/astro) U-118-MG | 0.0
Fetal Lung | 0.0 | CNS cancer (neuro;met) SK-N-AS | 0.0
Lung ca. NCI-N417 | 0.0 | CNS cancer (astro) SF-539 | 0.0
Lung ca. LX-1 | 0.5 | CNS cancer (astro) SNB-75 | 2.0
Lung ca. NCI-H146 | 0.0 | CNS cancer (glio) SNB-19 | 6.6
Lung ca. SHP-77 | 0.0 | CNS cancer (glio) SF-295 | 0.0
Lung ca. A549 | 0.0 | Brain (Amygdala) Pool | 1.5
Lung ca. NCI-H526 | 0.0 | Brain (cerebellum) | 5.6
Lung ca. NCI-H23 | 4.2 | Brain (fetal) | 0.0
Lung ca. NCI-H460 | 0.0 | Brain (Hippocampus) Pool | 4.0
Lung ca. HOP-62 | 0.0 | Cerebral Cortex Pool | 3.1
Lung ca. NCI-H522 | 0.0 | Brain (Substantia nigra) Pool | 1.4
Liver | 0.0 | Brain (Thalamus) Pool | 3.9
Fetal Liver | 0.0 | Brain (whole) | 6.2
Liver ca. HepG2 | 0.0 | Spinal Cord Pool | 0.3
Kidney Pool | 0.0 | Adrenal Gland | 6.0
Fetal Kidney | 1.3 | Pituitary gland Pool | 0.0
Renal ca. 786-0 | 0.0 | Salivary Gland | 0.0
Renal ca. A498 | 0.6 | Thyroid (female) | 0.0
Renal ca. ACHN | 0.0 | Pancreatic ca. CAPAN2 | 2.2
Renal ca. UO-31 | 0.0 | Pancreas Pool | 0.0



Table PC. Panel 5 Islet

Tissue Name | Rel. Exp.(%) Ag7038, Run 305424861 | Tissue Name | Rel. Exp.(%) Ag7038, Run 305424861
97457_Patient-02go_adipose | 3.0 | 94709_Donor 2 AM - A_adipose | 0.0
97476_Patient-07sk_skeletal muscle | 0.0 | 94710_Donor 2 AM - B_adipose | 100.0
97477_Patient-07ut_uterus | 0.0 | 94711_Donor 2 AM - C_adipose | 0.0
97478_Patient-07pl_placenta | 0.0 | 94712_Donor 2 AD - A_adipose | 0.0
99167_Bayer Patient 1 | 0.0 | 94713_Donor 2 AD - B_adipose | 0.0
97482_Patient-08ut_uterus | 0.0 | 94714_Donor 2 AD - C_adipose | 0.0
97483_Patient-08pl_placenta | 0.0 | 94742_Donor 3 U - A_Mesenchymal Stem Cells | 13.0
97486_Patient-09sk_skeletal muscle | 0.0 | 94743_Donor 3 U - B_Mesenchymal Stem Cells | 5.5
97487_Patient-09ut_uterus | 0.0 | 94730_Donor 3 AM - A_adipose | 0.0
97488_Patient-09pl_placenta | 0.0 | 94731_Donor 3 AM - B_adipose | 0.0
97492_Patient-10ut_uterus | 0.0 | 94732_Donor 3 AM - C_adipose |
97493_Patient-10pl_placenta | 0.0 | 94733_Donor 3 AD - A_adipose | 0.0
97495_Patient-11go_adipose | 2.7 | 94734_Donor 3 AD - B_adipose | 0.0
97496_Patient-11sk_skeletal muscle | 0.0 | 94735_Donor 3 AD - C_adipose | 0.0
97497_Patient-11ut_uterus | 0.0 | 77138_Liver_HepG2untreated | 0.0
97498_Patient-11pl_placenta | 0.0 | 73556_Heart_Cardiac stromal cells (primary) | 0.0
97500_Patient-12go_adipose | 1.5 | 81735_Small Intestine | 0.0
97501_Patient-12sk_skeletal muscle | 0.0 | 72409_Kidney_Proximal Convoluted Tubule | 2.4
97502_Patient-12ut_uterus | 1.0 | 82685_Small intestine_Duodenum | 0.0
97503_Patient-12pl_placenta | 0.0 | 90650_Adrenal_Adrenocortical adenoma | 0.0
94721_Donor 2 U - A_Mesenchymal Stem Cells | 0.0 | 72410_Kidney_HRCE | 5.7
94722_Donor 2 U - B_Mesenchymal Stem Cells | 0.0 | 72411_Kidney_HRE | 10.2
94723_Donor 2 U - C_Mesenchymal Stem Cells | 0.0 | 73139_Uterus_Uterine smooth muscle cells | 0.0



General_screening_panel_v1.6 Summary: Ag7038 Highest expression of this
gene is detected in the gastric cancer cell line NCI-N87 (CT=31.3). Expression of this gene
seems to be restricted to a number of colon and gastric cancer cell lines. Therefore,
expression of this gene may be used to distinguish colon and gastric cancer cell lines from
other samples in this panel and also as a diagnostic marker to detect the presence of colon
and gastric cancers. In addition, therapeutic modulation of this gene may be useful in the
treatment of colon and gastric cancer.
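
The relationship between a CT value and the relative expression percentages reported in the tables can be sketched as follows. This is only an illustration under two assumptions that are not stated in this section: that amplification is 100% efficient (one cycle corresponds to a two-fold change in template), and that Rel. Exp.(%) is scaled so that the sample with the lowest CT on the panel equals 100. The function names are hypothetical.

```python
import math


def relative_expression(ct: float, ct_min: float) -> float:
    """Relative expression (%) of a sample with threshold cycle `ct`,
    scaled to the panel's lowest CT `ct_min`.

    Assumes two-fold amplification per cycle (an assumption, see lead-in).
    """
    return 100.0 * 2.0 ** (ct_min - ct)


def ct_from_relative_expression(rel_exp_percent: float, ct_min: float) -> float:
    """Inverse of relative_expression: the CT implied by a given percentage.

    `rel_exp_percent` must be greater than zero.
    """
    return ct_min + math.log2(100.0 / rel_exp_percent)


if __name__ == "__main__":
    ct_min = 31.3  # CT reported for the highest-expressing sample (NCI-N87)
    # A sample detected one cycle later than the panel maximum.
    print(relative_expression(ct_min + 1.0, ct_min))
    # CT implied by a sample at one quarter of the panel maximum.
    print(ct_from_relative_expression(25.0, ct_min))
```

Under these assumptions, each additional cycle halves the reported percentage, so a sample one cycle later than the CT=31.3 maximum would read about 50%.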

Panel 5 Islet Summary: Ag7038 Low levels of expression of this gene are
restricted to adipose tissue (CT=33). Therefore, expression of this gene may be used to
distinguish this adipose sample from other samples in this panel. In addition, therapeutic
modulation of this gene may be useful in the treatment of metabolic diseases such as
obesity and diabetes.
Another experiment (Run 307650500) with this probe-primer set shows
low/undetectable expression (CTs > 35) across all of the samples on this panel.

Q. CG145754-03: Kallikrein-7.

Expression of gene CG145754-03 was assessed using the primer-probe set Ag5272,
described in Table QA. Results of the RTQ-PCR runs are shown in Table QB.
Table QA. Probe Name Ag5272

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-ggcagccaggggtgacaa-3' | 18 | 119 | 315
Probe | TET-5'-cgccccatgtgcaagaggctccc-3'-TAMRA | 23 | 149 | 316
Reverse | 5'-cctccgcagtggagctgatt-3' | 20 | 201 | 317

Table QB. Panel 4.1D

Tissue Name | Rel. Exp.(%) Ag5272, Run 230500478 | Tissue Name | Rel. Exp.(%) Ag5272, Run 230500478
Secondary Th1 act | 0.0 | HUVEC IL-1beta | 0.0
Secondary Th2 act | 0.0 | HUVEC IFN gamma | 0.0
Secondary Tr1 act | 0.0 | HUVEC TNF alpha + IFN gamma | 0.0
Secondary Th1 rest | 0.0 | HUVEC TNF alpha + IL4 | 0.0
Secondary Th2 rest | 0.0 | HUVEC IL-11 | 0.0
Secondary Tr1 rest | 0.0 | Lung Microvascular EC none | 0.0
Primary Th1 act | 0.0 | Lung Microvascular EC TNFalpha + IL-1beta | 0.0
Primary Th2 act | 0.0 | Microvascular Dermal EC none | 0.0
Primary Tr1 act | 0.0 | Microvascular Dermal EC TNFalpha + IL-1beta | 0.0
Primary Th1 rest | 0.0 | Bronchial epithelium TNFalpha + IL1beta | 1.3
Primary Th2 rest | 0.0 | Small airway epithelium none | 100.0
Primary Tr1 rest | 0.0 | Small airway epithelium TNFalpha + IL-1beta | 46.7
CD45RA CD4 lymphocyte act | 0.6 | Coronary artery SMC rest | 0.0
CD45RO CD4 lymphocyte act | 0.0 | Coronary artery SMC TNFalpha + IL-1beta | 0.0
CD8 lymphocyte act | 0.0 | Astrocytes rest |
Secondary CD8 lymphocyte rest | 0.0 | Astrocytes TNFalpha + IL-1beta | 0.0
Secondary CD8 lymphocyte act | 0.0 | KU-812 (Basophil) rest | 0.0
CD4 lymphocyte none | 0.0 | KU-812 (Basophil) PMA/ionomycin | 1.2
2ry Th1/Th2/Tr1_anti-CD95 CH11 | 0.0 | CCD1106 (Keratinocytes) none | 14.2
LAK cells rest | 0.0 | CCD1106 (Keratinocytes) TNFalpha + IL-1beta | 4.5
LAK cells IL-2 | 0.0 | Liver cirrhosis | 0.0
LAK cells IL-2+IL-12 | 0.0 | NCI-H292 none | 0.0
LAK cells IL-2+IFN gamma | 0.0 | NCI-H292 IL-4 | 0.0
LAK cells IL-2+IL-18 | 0.0 | NCI-H292 IL-9 | 0.0
LAK cells PMA/ionomycin | 0.0 | NCI-H292 IL-13 | 0.6
NK Cells IL-2 rest | 0.0 | NCI-H292 IFN gamma | 0.0
Two Way MLR 3 day | 0.0 | HPAEC none | 0.0
Two Way MLR 5 day | 0.0 | HPAEC TNF alpha + IL-1 beta | 0.0
Two Way MLR 7 day | 0.0 | Lung fibroblast none | 0.0
PBMC rest | 0.0 | Lung fibroblast TNF alpha + IL-1 beta | 0.0
PBMC PWM | 0.0 | Lung fibroblast IL-4 | 0.0
PBMC PHA-L | 0.0 | Lung fibroblast IL-9 |
Ramos (B cell) none | 0.0 | Lung fibroblast IL-13 | 0.0
Ramos (B cell) ionomycin | 0.0 | Lung fibroblast IFN gamma | 0.0
B lymphocytes PWM | 0.0 | Dermal fibroblast CCD1070 rest | 0.0
B lymphocytes CD40L and IL-4 | 0.0 | Dermal fibroblast CCD1070 TNF alpha | 0.0
EOL-1 dbcAMP | 0.0 | Dermal fibroblast CCD1070 IL-1 beta | 0.0
EOL-1 dbcAMP PMA/ionomycin | 0.0 | Dermal fibroblast IFN gamma | 0.0
Dendritic cells none | 0.0 | Dermal fibroblast IL-4 | 0.0
Dendritic cells LPS | 0.5 | Dermal Fibroblasts rest | 0.0
Dendritic cells anti-CD40 | 0.0 | Neutrophils TNFa+LPS | 0.0
Monocytes rest | 0.0 | Neutrophils rest | 0.0
Monocytes LPS | 0.0 | Colon | 0.0
Macrophages rest | 0.0 | Lung | 0.0
Macrophages LPS | 0.0 | Thymus | 0.0
HUVEC none | 0.0 | Kidney | 11.2
HUVEC starved | 0.0 | |







Panel 4.1D Summary: Ag5272 Highest expression of this gene is seen in resting
small airway epithelium (CT=32). Significant expression of this gene is also seen in
small airway epithelium treated with the cytokines TNF-alpha and IL-1 beta. Therefore, modulation of the
expression or activity of the protein encoded by this transcript through the application of
small molecule therapeutics may be useful in the treatment of asthma, COPD, and
emphysema.

R. CG146279-01: Potassium channel subfamily K member 10.

Expression of gene CG146279-01 was assessed using the primer-probe set Ag6035,
described in Table RA. Results of the RTQ-PCR runs are shown in Tables RB, RC, RD and
RE.

Table RA. Probe Name Ag6035

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-atgaaatttccaatcgagacg-3' | 21 | 61 | 318
Probe | TET-5'-ctaaagtggccgttcccgcagc-3'-TAMRA | 22 | 107 | 319
Reverse | 5'-ggggttgcccgttagtg-3' | 17 | 156 | 320



Table RB. CNS neurodegeneration v1.0

Tissue Name | Rel. Exp.(%) Ag6035, Run 225246892 | Tissue Name | Rel. Exp.(%) Ag6035, Run 225246892
AD 1 Hippo | 22.5 | Control (Path) 3 Temporal Ctx | 9.9
AD 2 Hippo | 25.9 | Control (Path) 4 Temporal Ctx | 38.2
AD 3 Hippo | 12.4 | AD 1 Occipital Ctx | 22.2
AD 4 Hippo | 13.5 | AD 2 Occipital Ctx (Missing) | 0.0
AD 5 Hippo | 82.9 | AD 3 Occipital Ctx | 5.3
AD 6 Hippo | 74.2 | AD 4 Occipital Ctx | 35.4
Control 2 Hippo | 21.5 | AD 5 Occipital Ctx | 40.9
Control 4 Hippo | 19.3 | AD 6 Occipital Ctx | 17.7
Control (Path) 3 Hippo | 8.2 | Control 1 Occipital Ctx | 4.8
AD 1 Temporal Ctx | 24.3 | Control 2 Occipital Ctx | 53.2
AD 2 Temporal Ctx | 43.8 | Control 3 Occipital Ctx | 39.2
AD 3 Temporal Ctx | 4.5 | Control 4 Occipital Ctx | 8.2
AD 4 Temporal Ctx | 36.6 | Control (Path) 1 Occipital Ctx | 88.3
AD 5 Inf Temporal Ctx | 100.0 | Control (Path) 2 Occipital Ctx | 7.1
AD 5 Sup Temporal Ctx | 62.0 | Control (Path) 3 Occipital Ctx | 2.5
AD 6 Inf Temporal Ctx | 74.7 | Control (Path) 4 Occipital Ctx |
AD 6 Sup Temporal Ctx | 65.1 | Control 1 Parietal Ctx | 8.9
Control 1 Temporal Ctx | 5.8 | Control 2 Parietal Ctx | 77.4
Control 2 Temporal Ctx | 29.5 | Control 3 Parietal Ctx | 17.1
Control 3 Temporal Ctx | 22.7 | Control (Path) 1 Parietal Ctx | 77.9
Control 3 Temporal Ctx | 22.7 | Control (Path) 2 Parietal Ctx | 22.4
Control (Path) 1 Temporal Ctx | 74.2 | Control (Path) 3 Parietal Ctx | 6.3
Control (Path) 2 Temporal Ctx | 47.0 | Control (Path) 4 Parietal Ctx | 51.4



Table RC. General screening panel v1.5

Tissue Name | Rel. Exp.(%) Ag6035, Run 228763481 | Tissue Name | Rel. Exp.(%) Ag6035, Run 228763481
Adipose | 0.5 | Renal ca. TK-10 | 9.3
Melanoma* Hs688(A).T | 0.0 | Bladder | 2.6
Melanoma* Hs688(B).T | 0.0 | Gastric ca. (liver met.) NCI-N87 | 8.2
Melanoma* M14 | 0.0 | Gastric ca. KATO III | 12.8
Melanoma* LOXIMVI | 0.0 | Colon ca. SW-948 | 1.0
Melanoma* SK-MEL-5 | 0.0 | Colon ca. SW480 | 14.8
Squamous cell carcinoma SCC-4 | 0.0 | Colon ca.* (SW480 met) SW620 | 29.1
Testis Pool | 1.3 | Colon ca. HT29 | 1.7
Prostate ca.* (bone met) PC-3 | 0.0 | Colon ca. HCT-116 | 12.7
Prostate Pool | 4.7 | Colon ca. CaCo-2 | 12.3
Placenta | 2.0 | Colon cancer tissue | 5.3
Uterus Pool | 2.5 | Colon ca. SW1116 | 0.0
Ovarian ca. OVCAR-3 | 3.3 | Colon ca. Colo-205 | 3.7
Ovarian ca. SK-OV-3 | 2.8 | Colon ca. SW-48 | 3.4
Ovarian ca. OVCAR-4 | 3.8 | Colon Pool | 0.9
Ovarian ca. OVCAR-5 | 7.0 | Small Intestine Pool | 1.5
Ovarian ca. IGROV-1 | 10.4 | Stomach Pool | 2.1
Ovarian ca. OVCAR-8 | 3.1 | Bone Marrow Pool | 0.5
Ovary | 1.1 | Fetal Heart | 1.3
Breast ca. MCF-7 | 3.7 | Heart Pool | 0.2
Breast ca. MDA-MB-231 | 6.9 | Lymph Node Pool | 0.9
Breast ca. BT 549 | 2.0 | Fetal Skeletal Muscle | 1.4
Breast ca. T47D | 1.1 | Skeletal Muscle Pool | 2.3
Breast ca. MDA-N | 4.3 | Spleen Pool | 0.6
Breast Pool | 4.9 | Thymus Pool | 2.8
Trachea | 0.2 | CNS cancer (glio/astro) U87-MG | 0.0
Lung | 1.1 | CNS cancer (glio/astro) U-118-MG | 2.8
Fetal Lung | 4.1 | CNS cancer (neuro;met) SK-N-AS |
Lung ca. NCI-N417 | 3.9 | CNS cancer (astro) SF-539 | 2.2
Lung ca. LX-1 | 30.1 | CNS cancer (astro) SNB-75 | 4.6
Lung ca. NCI-H146 | 8.4 | CNS cancer (glio) SNB-19 | 4.4
Lung ca. SHP-77 | 33.4 | CNS cancer (glio) SF-295 | 11.4
Lung ca. A549 | 15.3 | Brain (Amygdala) Pool | 15.1
Lung ca. NCI-H526 | 4.8 | Brain (cerebellum) | 100.0
Lung ca. NCI-H23 | 5.1 | Brain (fetal) | 92.7
Lung ca. NCI-H460 | 7.9 | Brain (Hippocampus) Pool | 32.1
Lung ca. HOP-62 | 0.0 | Cerebral Cortex Pool | 21.8
Lung ca. NCI-H522 | 0.0 | Brain (Substantia nigra) Pool | 18.4
Liver | 0.5 | Brain (Thalamus) Pool | 24.8
Fetal Liver | 2.0 | Brain (whole) | 29.9
Liver ca. HepG2 | 7.4 | Spinal Cord Pool | 16.3
Kidney Pool | 1.6 | Adrenal Gland | 2.2
Fetal Kidney | 3.5 | Pituitary gland Pool | 3.7
Renal ca. 786-0 | 2.4 | Salivary Gland | 1.0
Renal ca. A498 | 2.4 | Thyroid (female) | 2.0
Renal ca. ACHN | 11.8 | Pancreatic ca. CAPAN2 | 0.0
Renal ca. UO-31 | 6.2 | Pancreas Pool | 0.6



Table RD. Panel 4.1D

Tissue Name | Rel. Exp.(%) Ag6035, Run 225157775 | Tissue Name | Rel. Exp.(%) Ag6035, Run 225157775
Secondary Th1 act | 0.0 | HUVEC IL-1beta | 0.0
Secondary Th2 act | 0.0 | HUVEC IFN gamma | 0.0
Secondary Tr1 act | 0.0 | HUVEC TNF alpha + IFN gamma | 0.0
Secondary Th1 rest | 0.0 | HUVEC TNF alpha + IL4 | 0.0
Secondary Th2 rest | 0.0 | HUVEC IL-11 | 0.0
Secondary Tr1 rest | 0.0 | Lung Microvascular EC none | 0.0
Primary Th1 act | 0.0 | Lung Microvascular EC TNFalpha + IL-1beta | 0.0
Primary Th2 act | 0.0 | Microvascular Dermal EC none | 0.0
Primary Tr1 act | 0.0 | Microvascular Dermal EC TNFalpha + IL-1beta | 0.0
Primary Th1 rest | 0.0 | Bronchial epithelium TNFalpha + IL1beta | 0.0
Primary Th2 rest | 0.0 | Small airway epithelium none | 0.0
Primary Tr1 rest | 0.0 | Small airway epithelium TNFalpha + IL-1beta | 0.0
CD45RA CD4 lymphocyte act | 0.0 | Coronary artery SMC rest |
CD45RO CD4 lymphocyte act | 0.0 | Coronary artery SMC TNFalpha + IL-1beta | 0.0
CD8 lymphocyte act | 0.0 | Astrocytes rest | 0.0
Secondary CD8 lymphocyte rest | 0.0 | Astrocytes TNFalpha + IL-1beta | 0.0
Secondary CD8 lymphocyte act | 0.0 | KU-812 (Basophil) rest | 0.0
CD4 lymphocyte none | 0.0 | KU-812 (Basophil) PMA/ionomycin | 0.0
2ry Th1/Th2/Tr1_anti-CD95 CH11 | 0.0 | CCD1106 (Keratinocytes) none | 0.0
LAK cells rest | 0.0 | CCD1106 (Keratinocytes) TNFalpha + IL-1beta | 0.0
LAK cells IL-2 | 0.0 | Liver cirrhosis | 0.0
LAK cells IL-2+IL-12 | 0.0 | NCI-H292 none | 0.0
LAK cells IL-2+IFN gamma | 0.0 | NCI-H292 IL-4 | 0.0
LAK cells IL-2+IL-18 | 0.0 | NCI-H292 IL-9 | 0.0
LAK cells PMA/ionomycin | 0.0 | NCI-H292 IL-13 | 0.0
NK Cells IL-2 rest | 0.0 | NCI-H292 IFN gamma | 0.0
Two Way MLR 3 day | 10.1 | HPAEC none | 0.0
Two Way MLR 5 day | 0.0 | HPAEC TNF alpha + IL-1 beta | 0.0
Two Way MLR 7 day | 0.0 | Lung fibroblast none | 0.0
PBMC rest | 5.5 | Lung fibroblast TNF alpha + IL-1 beta | 0.0
PBMC PWM | 0.0 | Lung fibroblast IL-4 | 0.0
PBMC PHA-L | 0.0 | Lung fibroblast IL-9 | 0.0
Ramos (B cell) none | 0.0 | Lung fibroblast IL-13 | 0.0
Ramos (B cell) ionomycin | 0.0 | Lung fibroblast IFN gamma | 0.0
B lymphocytes PWM | 0.0 | Dermal fibroblast CCD1070 rest | 0.0
B lymphocytes CD40L and IL-4 | 0.0 | Dermal fibroblast CCD1070 TNF alpha | 0.0
EOL-1 dbcAMP | 100.0 | Dermal fibroblast CCD1070 IL-1 beta | 0.0
EOL-1 dbcAMP PMA/ionomycin | 36.1 | Dermal fibroblast IFN gamma | 0.0
Dendritic cells none | 0.0 | Dermal fibroblast IL-4 | 0.0
Dendritic cells LPS | 0.0 | Dermal Fibroblasts rest | 0.0
Dendritic cells anti-CD40 | 0.0 | Neutrophils TNFa+LPS | 0.0
Monocytes rest | 9.9 | Neutrophils rest | 0.0
Monocytes LPS | 0.0 | Colon | 0.0
Macrophages rest | 0.0 | Lung | 0.0
Macrophages LPS | 0.0 | Thymus | 8.5
HUVEC none | 0.0 | Kidney | 7.7
HUVEC starved | 0.0 | |

Table RE. Panel 5 Islet

Tissue Name | Rel. Exp.(%) Ag6035, Run 253578284 | Rel. Exp.(%) Ag6035, Run 306414003 | Tissue Name | Rel. Exp.(%) Ag6035, Run 253578284 | Rel. Exp.(%) Ag6035, Run 306414003
97457_Patient-02go_adipose | 0.0 | 0.0 | 94709_Donor 2 AM - A_adipose | 0.0 | 0.0
97476_Patient-07sk_skeletal muscle | 0.0 | 0.0 | 94710_Donor 2 AM - B_adipose | 0.0 | 0.0
97477_Patient-07ut_uterus | 0.0 | 0.0 | 94711_Donor 2 AM - C_adipose | 0.0 | 0.0
97478_Patient-07pl_placenta | 0.0 | 0.0 | 94712_Donor 2 AD - A_adipose | 0.0 | 0.0
99167_Bayer Patient 1 | 100.0 | 100.0 | 94713_Donor 2 AD - B_adipose | 0.0 | 0.0
97482_Patient-08ut_uterus | 0.0 | 0.0 | 94714_Donor 2 AD - C_adipose | 0.0 | 0.0
97483_Patient-08pl_placenta | 0.0 | 0.0 | 94742_Donor 3 U - A_Mesenchymal Stem Cells | 0.0 | 0.0
97486_Patient-09sk_skeletal muscle | 0.0 | 0.0 | 94743_Donor 3 U - B_Mesenchymal Stem Cells | 0.0 | 0.0
97487_Patient-09ut_uterus | 0.0 | 0.0 | 94730_Donor 3 AM - A_adipose | 0.0 | 0.0
97488_Patient-09pl_placenta | 0.0 | 0.0 | 94731_Donor 3 AM - B_adipose | 0.0 | 0.0
97492_Patient-10ut_uterus | 0.0 | 0.0 | 94732_Donor 3 AM - C_adipose | 0.0 | 0.0
97493_Patient-10pl_placenta | 0.0 | 0.0 | 94733_Donor 3 AD - A_adipose | 0.0 | 0.0
97495_Patient-11go_adipose | 0.0 | 0.0 | 94734_Donor 3 AD - B_adipose | 0.0 | 0.0
97496_Patient-11sk_skeletal muscle | 0.0 | 0.0 | 94735_Donor 3 AD - C_adipose | 0.0 | 0.0
97497_Patient-11ut_uterus | 0.0 | 0.0 | 77138_Liver_HepG2untreated | 0.0 | 0.0
97498_Patient-11pl_placenta | 0.0 | 0.0 | 73556_Heart_Cardiac stromal cells (primary) | 0.0 | 0.0
97500_Patient-12go_adipose | 0.0 | 0.0 | 81735_Small Intestine | 0.0 | 0.0
97501_Patient-12sk_skeletal muscle | 0.0 | 0.0 | 72409_Kidney_Proximal Convoluted Tubule | 0.0 | 0.0
97502_Patient-12ut_uterus | 0.0 | 0.0 | 82685_Small intestine_Duodenum | 0.0 | 0.0
97503_Patient-12pl_placenta | 0.0 | 0.0 | 90650_Adrenal_Adrenocortical adenoma | 0.0 | 16.2
94721_Donor 2 U - A_Mesenchymal Stem Cells | 0.0 | 0.0 | 72410_Kidney_HRCE | 0.0 | 0.0
94722_Donor 2 U - B_Mesenchymal Stem Cells | 0.0 | 0.0 | 72411_Kidney_HRE | 0.0 | 0.0
94723_Donor 2 U - C_Mesenchymal Stem Cells | 0.0 | 0.0 | 73139_Uterus_Uterine smooth muscle cells | 0.0 | 0.0



CNS_neurodegeneration_v1.0 Summary: Ag6035 This panel confirms the
expression of this gene at low levels in the brains of an independent group of individuals.
However, no differential expression of this gene was detected between Alzheimer's
diseased postmortem brains and those of non-demented controls in this experiment. Please
see Panel 1.5 for a discussion of this gene in the treatment of central nervous system disorders.

General_screening_panel_v1.5 Summary: Ag6035 Highest expression of this 
gene is detected in cerebellum (CT=27). This gene codes for a splice variant of potassium 
channel TREK2. As reported in the literature (Bang et al., 2000, J Biol Chem 275(23):17412-9, 
PMID: 10747911), this gene is expressed preferentially in all regions of the brain. 
Therefore, therapeutic modulation of this gene product may be useful in the treatment of 
central nervous system disorders such as Alzheimer's disease, Parkinson's disease, epilepsy, 
multiple sclerosis, schizophrenia and depression. 

Moderate to low levels of expression of this gene are also seen in a number of cancer 
cell lines derived from brain, colon, gastric, renal, lung, breast and ovarian cancers. 
Therefore, therapeutic modulation of this gene may be useful in the treatment of these 
cancers. 

In addition, low levels of expression of this gene are also seen in tissues with 
metabolic/endocrine functions, including pancreas, adipose, adrenal gland, thyroid, 
pituitary gland, skeletal muscle, heart, liver and the gastrointestinal tract. Therefore, 
therapeutic modulation of the activity of this gene may prove useful in the treatment of 
endocrine/metabolically related diseases, such as obesity and diabetes. 

Panel 4.1D Summary: Ag6035 Highest expression of this gene is detected in 
eosinophils (CT=32.5). Low levels of expression of this gene are also seen in 
PMA/ionomycin treated eosinophils. Therefore, therapeutic modulation of this gene or its 
protein product may be useful in the treatment of hematopoietic disorders involving 
eosinophils, parasitic infections, autoimmune and inflammatory diseases 
and asthma. 

Panel 5 Islet Summary: Ag6035 Two experiments with the same probe-primer set 
are in excellent agreement. Low levels of expression of this gene are restricted to islet cells 
(CTs=33-34). This gene codes for a splice variant of potassium channel TREK2. Potassium 
channels play an important role in insulin secretion by islet beta cells upon stimulation by 
glucose. Alteration in the insulin secretion pathway through the use of sulfonylureas or 
genetic inactivation of K(ATP) channels may lead to inappropriate insulin secretion at low 
glucose (Henquin JC, 2000, Diabetes 49(11):1751-60, PMID: 11078440). Therefore, 
therapeutic modulation of this gene or its protein product may be useful in the treatment 
of type 2 diabetes. 

S. CG146403-01: Diacylglycerol acyltransferase 2. 

Expression of gene CG146403-01 was assessed using the primer-probe set Ag6034, 
described in Table SA. Results of the RTQ-PCR runs are shown in Tables SB, SC and SD. 
Table SA. Probe Name Ag6034

Primers | Sequence | Length | Start Position | SEQ ID No
Forward | 5'-tggggagaatgacatctttaga-3' | 22 | 540 | 321
Probe | TET-5'-cttaaggcttttgccacaggctcctg-3'-TAMRA | 26 | 562 | 322
Reverse | 5'-agagaagcccatgagcttctt-3' | 21 | 613 | 323
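
As a quick consistency check on Table SA, the stated oligonucleotide lengths can be recomputed from the listed sequences. The following is a minimal sketch, assuming the sequences are transcribed without the 5'/3' and TET/TAMRA labels; the helper name `check_primer_lengths` is hypothetical and is not part of the described method.

```python
# Primer-probe set Ag6034 as listed in Table SA: (name, sequence, stated length).
AG6034 = [
    ("Forward", "tggggagaatgacatctttaga", 22),
    ("Probe", "cttaaggcttttgccacaggctcctg", 26),
    ("Reverse", "agagaagcccatgagcttctt", 21),
]


def check_primer_lengths(rows):
    """Return the names of entries whose sequence length differs from the stated length."""
    mismatches = []
    for name, seq, stated in rows:
        # Only the four unambiguous DNA bases are expected in these oligonucleotides.
        if set(seq) - set("acgt"):
            raise ValueError(f"{name}: unexpected character in sequence")
        if len(seq) != stated:
            mismatches.append(name)
    return mismatches


if __name__ == "__main__":
    # An empty list means every stated length agrees with its sequence.
    print(check_primer_lengths(AG6034))
```

The same check can be applied to the other primer-probe tables by substituting their rows.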



Table SB. General_screening_panel_v1.5

Tissue Name | Rel. Exp.(%) Ag6034, Run 228763480 | Tissue Name | Rel. Exp.(%) Ag6034, Run 228763480
Adipose | 0.2 | Renal ca. TK-10 | 27.9
Melanoma* Hs688(A).T | 0.0 | Bladder | 1.2
Melanoma* Hs688(B).T | 0.0 | Gastric ca. (liver met.) NCI-N87 | 0.5
Melanoma* M14 | 0.1 | Gastric ca. KATO III | 7.9
Melanoma* LOXIMVI | 0.0 | Colon ca. SW-948 | 3.6
Melanoma* SK-MEL-5 | 0.2 | Colon ca. SW480 | 12.5
Squamous cell carcinoma SCC-4 | 0.0 | Colon ca.* (SW480 met) SW620 |
Testis Pool | 0.2 | Colon ca. HT29 | 22.7
Prostate ca.* (bone met) PC-3 | 0.4 | Colon ca. HCT-116 | 0.0
Prostate Pool | 0.0 | Colon ca. CaCo-2 | 100.0
Placenta | 0.0 | Colon cancer tissue | 63.3
Uterus Pool | 0.0 | Colon ca. SW1116 | 0.0
Ovarian ca. OVCAR-3 | 0.0 | Colon ca. Colo-205 | 2.1
Ovarian ca. SK-OV-3 | 0.2 | Colon ca. SW-48 | 50.0
Ovarian ca. OVCAR-4 | 0.0 | Colon Pool | 0.2
Ovarian ca. OVCAR-5 | 0.1 | Small Intestine Pool | 0.4
Ovarian ca. IGROV-1 | 0.1 | Stomach Pool | 0.0
Ovarian ca. OVCAR-8 | 0.2 | Bone Marrow Pool | 0.0
Ovary | 0.0 | Fetal Heart | 0.2
Breast ca. MCF-7 | 0.0 | Heart Pool | 0.1
Breast ca. MDA-MB-231 | 0.0 | Lymph Node Pool | 0.0
Breast ca. BT 549 | 0.1 | Fetal Skeletal Muscle | 0.1
Breast ca. T47D | 0.0 | Skeletal Muscle Pool | 0.0
Breast ca. MDA-N | 0.0 | Spleen Pool | 0.0
Breast Pool | 0.0 | Thymus Pool | 0.1
Trachea | 0.0 | CNS cancer (glio/astro) U87-MG | 0.0
Lung | 0.0 | CNS cancer (glio/astro) U-118-MG | 0.0
Fetal Lung | 0.4 | CNS cancer (neuro;met) SK-N-AS | 0.0
Lung ca. NCI-N417 | 0.1 | CNS cancer (astro) SF-539 | 0.2
Lung ca. LX-1 | 28.1 | CNS cancer (astro) SNB-75 | 0.0
Lung ca. NCI-H146 | 0.0 | CNS cancer (glio) SNB-19 | 0.0
Lung ca. SHP-77 | 0.0 | CNS cancer (glio) SF-295 | 0.0
Lung ca. A549 | 0.7 | Brain (Amygdala) Pool | 0.0
Lung ca. NCI-H526 | 0.0 | Brain (cerebellum) | 0.1
Lung ca. NCI-H23 | 0.0 | Brain (fetal) | 0.2
Lung ca. NCI-H460 | 4.2 | Brain (Hippocampus) Pool | 0.0
Lung ca. HOP-62 | 0.0 | Cerebral Cortex Pool | 0.1
Lung ca. NCI-H522 | 0.2 | Brain (Substantia nigra) Pool | 0.0
Liver | 1.7 | Brain (Thalamus) Pool | 0.0
Fetal Liver | 55.9 | Brain (whole) | 1.1
Liver ca. HepG2 | 62.9 | Spinal Cord Pool | 0.0
Kidney Pool | 0.0 | Adrenal Gland | 0.0
Fetal Kidney | 5.1 | Pituitary gland Pool | 0.0
Renal ca. 786-0 | 0.0 | Salivary Gland | 0.0
Renal ca. A498 | 0.1 | Thyroid (female) | 0.0
Renal ca. ACHN | 0.0 | Pancreatic ca. CAPAN2 | 0.0
Renal ca. UO-31 | 0.0 | Pancreas Pool | 0.0
Table SC. Panel 4.1D

Tissue Name | Rel. Exp.(%) Ag6034, Run 225245213 | Tissue Name | Rel. Exp.(%) Ag6034, Run 225245213
Secondary Th1 act | 0.0 | HUVEC IL-1beta | 0.0
Secondary Th2 act | 0.0 | HUVEC IFN gamma | 0.0
Secondary Tr1 act | 0.4 | HUVEC TNF alpha + IFN gamma | 0.0
Secondary Th1 rest | 0.0 | HUVEC TNF alpha + IL4 | 0.0
Secondary Th2 rest | 0.0 | HUVEC IL-11 | 0.0
Secondary Tr1 rest | 0.0 | Lung Microvascular EC none | 0.0
Primary Th1 act | 0.0 | Lung Microvascular EC TNFalpha + IL-1beta | 0.0
Primary Th2 act | 0.0 | Microvascular Dermal EC none | 0.0
Primary Tr1 act | 0.0 | Microvascular Dermal EC TNFalpha + IL-1beta | 0.0
Primary Th1 rest | 0.0 | Bronchial epithelium TNFalpha + IL1beta | 0.0
Primary Th2 rest | 0.0 | Small airway epithelium none | 0.0
Primary Tr1 rest | 0.0 | Small airway epithelium TNFalpha + IL-1beta | 0.0
CD45RA CD4 lymphocyte act | 0.0 | Coronary artery SMC rest | 0.0
CD45RO CD4 lymphocyte act | 0.0 | Coronary artery SMC TNFalpha + IL-1beta | 0.6
CD8 lymphocyte act | 0.0 | Astrocytes rest | 0.0
Secondary CD8 lymphocyte rest | 0.0 | Astrocytes TNFalpha + IL-1beta | 0.0
Secondary CD8 lymphocyte act | 0.0 | KU-812 (Basophil) rest | 0.0
CD4 lymphocyte none | 0.0 | KU-812 (Basophil) PMA/ionomycin | 0.0
2ry Th1/Th2/Tr1_anti-CD95 CH11 | 0.0 | CCD1106 (Keratinocytes) none | 0.0
LAK cells rest | 0.0 | CCD1106 (Keratinocytes) TNFalpha + IL-1beta | 0.0
LAK cells IL-2 | 0.0 | Liver cirrhosis | 17.0
LAK cells IL-2+IL-12 | 0.0 | NCI-H292 none | 0.0
LAK cells IL-2+IFN gamma | 0.0 | NCI-H292 IL-4 | 0.0
LAK cells IL-2+IL-18 | 0.0 | NCI-H292 IL-9 | 0.0
LAK cells PMA/ionomycin | 0.0 | NCI-H292 IL-13 | 0.0
NK Cells IL-2 rest | 0.0 | NCI-H292 IFN gamma | 0.0
Two Way MLR 3 day | 0.0 | HPAEC none | 0.0
Two Way MLR 5 day | 0.0 | HPAEC TNF alpha + IL-1 beta | 0.0
Two Way MLR 7 day | 0.0 | Lung fibroblast none | 0.0
PBMC rest | 0.0 | Lung fibroblast TNF alpha + IL-1 beta | 0.0
PBMC PWM | 0.9 | Lung fibroblast IL-4 |
PBMC PHA-L | 0.0 | Lung fibroblast IL-9 | 0.0
Ramos (B cell) none | 0.0 | Lung fibroblast IL-13 |
Ramos (B cell) ionomycin | 0.0 | Lung fibroblast IFN gamma | 0.0
B lymphocytes PWM | | Dermal fibroblast CCD1070 rest |
B lymphocytes CD40L and IL-4 | 0.0 | Dermal fibroblast CCD1070 TNF alpha | 0.0
EOL-1 dbcAMP | 0.0 | Dermal fibroblast CCD1070 IL-1 beta | 0.0
EOL-1 dbcAMP PMA/ionomycin | 0.0 | Dermal fibroblast IFN gamma |
Dendritic cells none | 0.0 | Dermal fibroblast IL-4 | 0.0
Dendritic cells LPS | 0.0 | Dermal Fibroblasts rest | 0.3
Dendritic cells anti-CD40 | 0.5 | Neutrophils TNFa+LPS | 4.0
Monocytes rest | 0.0 | Neutrophils rest | 0.0
Monocytes LPS | 0.0 | Colon | 81.2
Macrophages rest | 0.0 | Lung | 4.7
Macrophages LPS | 0.0 | Thymus | 18.0
HUVEC none | 0.0 | Kidney | 100.0
HUVEC starved | 0.0 | |







Table SD. Panel 5 Islet

Tissue Name | Rel. Exp.(%) Ag6034, Run 256791126 | Tissue Name | Rel. Exp.(%) Ag6034, Run 256791126
97457_Patient-02go_adipose | 0.0 | 94709_Donor 2 AM - A_adipose | 0.0
97476_Patient-07sk_skeletal muscle | 0.0 | 94710_Donor 2 AM - B_adipose | 0.0
97477_Patient-07ut_uterus | 0.0 | 94711_Donor 2 AM - C_adipose | 0.0
97478_Patient-07pl_placenta | 0.0 | 94712_Donor 2 AD - A_adipose | 0.0
99167_Bayer Patient 1 | 0.0 | 94713_Donor 2 AD - B_adipose | 0.0
97482_Patient-08ut_uterus | 0.0 | 94714_Donor 2 AD - C_adipose | 0.0
97483_Patient-08pl_placenta | 0.0 | 94742_Donor 3 U - A_Mesenchymal Stem Cells | 0.0
97486_Patient-09sk_skeletal muscle | 0.0 | 94743_Donor 3 U - B_Mesenchymal Stem Cells | 0.0
97487_Patient-09ut_uterus | 0.0 | 94730_Donor 3 AM - A_adipose | 0.0
97488_Patient-09pl_placenta | 0.0 | 94731_Donor 3 AM - B_adipose | 0.0
97492_Patient-10ut_uterus | 0.0 | 94732_Donor 3 AM - C_adipose | 0.0
97493_Patient-10pl_placenta | 0.0 | 94733_Donor 3 AD - A_adipose | 0.0
97495_Patient-11go_adipose | 0.0 | 94734_Donor 3 AD - B_adipose | 0.0
97496_Patient-11sk_skeletal muscle | 0.0 | 94735_Donor 3 AD - C_adipose | 0.0
97497_Patient-11ut_uterus | 0.0 | 77138_Liver_HepG2untreated | 100.0
97498_Patient-11pl_placenta | 0.0 | 73556_Heart_Cardiac stromal cells (primary) | 0.0
97500_Patient-12go_adipose | | 81735_Small Intestine |
97501_Patient-12sk_skeletal muscle | 0.0 | 72409_Kidney_Proximal Convoluted Tubule | 0.0
97502_Patient-12ut_uterus | 0.0 | 82685_Small intestine_Duodenum | 31.2
97503_Patient-12pl_placenta | 0.0 | 90650_Adrenal_Adrenocortical adenoma | 0.0
94721_Donor 2 U - A_Mesenchymal Stem Cells | 0.0 | 72410_Kidney_HRCE | 0.0
94722_Donor 2 U - B_Mesenchymal Stem Cells | 0.0 | 72411_Kidney_HRE | 0.0
94723_Donor 2 U - C_Mesenchymal Stem Cells | 0.0 | 73139_Uterus_Uterine smooth muscle cells | 0.0



CNS_neurodegeneration_v1.0 Summary: Ag6034 Expression of this gene is 
low/undetectable in all samples on this panel (CTs>35). (Data not shown.) 

General_screening_panel_v1.5 Summary: Ag6034 Highest expression of this 
gene is seen in colon cancer (CT=26.3). High to moderate levels of expression are also seen 
in colon, renal, liver and lung cancer cell lines, as well as in fetal lung. This expression 
suggests that this gene may be involved in these cancers. Thus, expression of this gene 
could be used to differentiate between these samples and other samples on this panel and as 
a marker of these cancers. Therapeutic modulation of the expression or function of this 
gene may also be useful in the treatment of these cancers. 

Panel 4.1D Summary: Ag6034 Expression of this gene is highest in colon and 
kidney (CTs=30). Thus, expression of this gene could be used as a marker of these tissues. 

Panel 5 Islet Summary: Ag6034 Highest expression of this gene is seen in a liver 
cell line (CT=30.6). Thus, expression of this gene could be used to differentiate between 
this sample and other samples on this panel. 

T. CG146513-01: Diacylglycerol acyltransferase 2. 

Expression of gene CG146513-01 was assessed using the primer-probe set Ag6036, 
described in Table TA. Results of the RTQ-PCR runs are shown in Table TB. 
Table TA. Probe Name Ag6036

Primers | Sequence | Length | Start Position | SEQ ID No
Forward | 5'-tggaccctatggaagtatttcc-3' | 22 | 326 | 324
Probe | TET-5'-ttcccagtacagctggtgaagactca-3'-TAMRA | 26 | 356 | 325
Reverse | 5'-gttgtgtttgggagaaagatca-3' | 22 | 382 | 326



Table TB. Panel 5 Islet

Tissue Name | Rel. Exp.(%) Ag6036, Run 279370869 | Tissue Name | Rel. Exp.(%) Ag6036, Run 279370869
97457_Patient-02go_adipose | 10.5 | 94709_Donor 2 AM - A_adipose | 11.4
97476_Patient-07sk_skeletal muscle | 0.0 | 94710_Donor 2 AM - B_adipose | 6.7
97477_Patient-07ut_uterus | 3.3 | 94711_Donor 2 AM - C_adipose | 4.2
97478_Patient-07pl_placenta | 6.0 | 94712_Donor 2 AD - A_adipose | 23.8
99167_Bayer Patient 1 | 3.3 | 94713_Donor 2 AD - B_adipose | 32.8
97482_Patient-08ut_uterus | 2.6 | 94714_Donor 2 AD - C_adipose | 22.2
97483_Patient-08pl_placenta | 1.0 | 94742_Donor 3 U - A_Mesenchymal Stem Cells | 2.6
97486_Patient-09sk_skeletal muscle | 8.4 | 94743_Donor 3 U - B_Mesenchymal Stem Cells | 2.5
97487_Patient-09ut_uterus | 5.8 | 94730_Donor 3 AM - A_adipose | 12.9
97488_Patient-09pl_placenta | 2.2 | 94731_Donor 3 AM - B_adipose | 21.0
97492_Patient-10ut_uterus | 4.0 | 94732_Donor 3 AM - C_adipose | 20.4
97493_Patient-10pl_placenta | 3.2 | 94733_Donor 3 AD - A_adipose | 26.4
97495_Patient-11go_adipose | 6.0 | 94734_Donor 3 AD - B_adipose | 25.5
97496_Patient-11sk_skeletal muscle | 20.2 | 94735_Donor 3 AD - C_adipose | 6.5
97497_Patient-11ut_uterus | 8.7 | 77138_Liver_HepG2untreated | 41.5
97498_Patient-11pl_placenta | 1.9 | 73556_Heart_Cardiac stromal cells (primary) | 1.6
97500_Patient-12go_adipose | 4.0 | 81735_Small Intestine | 10.7
97501_Patient-12sk_skeletal muscle | 22.2 | 72409_Kidney_Proximal Convoluted Tubule | 100.0
97502_Patient-12ut_uterus | 7.1 | 82685_Small intestine_Duodenum | 15.7
97503_Patient-12pl_placenta | 1.3 | 90650_Adrenal_Adrenocortical adenoma | 5.0
94721_Donor 2 U - A_Mesenchymal Stem Cells | 12.8 | 72410_Kidney_HRCE | 31.2
94722_Donor 2 U - B_Mesenchymal Stem Cells | 6.8 | 72411_Kidney_HRE | 9.1
94723_Donor 2 U - C_Mesenchymal Stem Cells | 11.2 | 73139_Uterus_Uterine smooth muscle cells | 13.3



CNS_neurodegeneration_v1.0 Summary: Ag6036 Expression of this gene is 
low/undetectable in all samples on this panel (CTs>35). 

General_screening_panel_v1.5 Summary: Ag6036 Expression of this gene is 
low/undetectable in all samples on this panel (CTs>35). 

Panel 4.1D Summary: Ag6036 Expression of this gene is low/undetectable in all 
samples on this panel (CTs>35). 

Panel 5 Islet Summary: Ag6036 Highest expression of this gene is seen in a 
kidney derived sample (CT=29.5). Moderate levels of expression are seen in many samples 
on this panel, including samples from uterus, placenta, adipose, and skeletal muscle. Thus, 
this gene may be involved in diseases of these tissues, including obesity and diabetes. 

U. CG146522-01: Diacylglycerol acyltransferase 2. 

Expression of gene CG146522-01 was assessed using the primer-probe set Ag6037, 
described in Table UA. Results of the RTQ-PCR runs are shown in Table UB. 
Table UA. Probe Name Ag6037

Primers | Sequence | Length | Start Position | SEQ ID No
Forward | 5'-attccaagcagcctagtcactt-3' | 22 | 49 | 327
Probe | TET-5'-ttctgcagtggcctttgagctacctt-3'-TAMRA | 26 | 85 | 328
Reverse | 5'-cagcaggtagacgaacaagatg-3' | 22 | 113 | 329
Table UB. Panel 5 Islet

Tissue Name | Rel. Exp.(%) Ag6037, Run 279370870 | Tissue Name | Rel. Exp.(%) Ag6037, Run 279370870
97457_Patient-02go_adipose | | 94709_Donor 2 AM - A_adipose |
97476_Patient-07sk_skeletal muscle | 0.0 | 94710_Donor 2 AM - B_adipose | 0.0
97477_Patient-07ut_uterus | 0.0 | 94711_Donor 2 AM - C_adipose | 0.0
97478_Patient-07pl_placenta | 0.0 | 94712_Donor 2 AD - A_adipose | 0.0
99167_Bayer Patient 1 | 0.9 | 94713_Donor 2 AD - B_adipose | 0.0
97482_Patient-08ut_uterus | 0.8 | 94714_Donor 2 AD - C_adipose | 0.0
97483_Patient-08pl_placenta | 0.0 | 94742_Donor 3 U - A_Mesenchymal Stem Cells | 0.0
97486_Patient-09sk_skeletal muscle | 9.0 | 94743_Donor 3 U - B_Mesenchymal Stem Cells | 0.0
97487_Patient-09ut_uterus | | 94730_Donor 3 AM - A_adipose |
97488_Patient-09pl_placenta | 0.0 | 94731_Donor 3 AM - B_adipose | 0.0
97492_Patient-10ut_uterus | 0.5 | 94732_Donor 3 AM - C_adipose | 0.0
97493_Patient-10pl_placenta | 3.5 | 94733_Donor 3 AD - A_adipose | 0.0
97495_Patient-11go_adipose | 1.2 | 94734_Donor 3 AD - B_adipose | 0.9
97496_Patient-11sk_skeletal muscle | 39.2 | 94735_Donor 3 AD - C_adipose | 0.0
97497_Patient-11ut_uterus | 0.0 | 77138_Liver_HepG2untreated | 0.0
97498_Patient-11pl_placenta | 0.0 | 73556_Heart_Cardiac stromal cells (primary) | 0.0
97500_Patient-12go_adipose | 1.7 | 81735_Small Intestine | 1.0
97501_Patient-12sk_skeletal muscle | 100.0 | 72409_Kidney_Proximal Convoluted Tubule | 0.0
97502_Patient-12ut_uterus | 0.0 | 82685_Small intestine_Duodenum | 0.0
97503_Patient-12pl_placenta | 1.0 | 90650_Adrenal_Adrenocortical adenoma | 0.0
94721_Donor 2 U - A_Mesenchymal Stem Cells | 0.0 | 72410_Kidney_HRCE | 0.0
94722_Donor 2 U - B_Mesenchymal Stem Cells | 0.0 | 72411_Kidney_HRE | 0.0
94723_Donor 2 U - C_Mesenchymal Stem Cells | 0.5 | 73139_Uterus_Uterine smooth muscle cells | 0.0

CNS_neurodegeneration_v1.0 Summary: Ag6037 Expression of this gene is 
low/undetectable in all samples on this panel (CTs>35). 

General_screening_panel_v1.5 Summary: Ag6037 Expression of this gene is 
low/undetectable in all samples on this panel (CTs>35). 

Panel 4.1D Summary: Ag6037 Expression of this gene is low/undetectable in all 
samples on this panel (CTs>35). 

Panel 5 Islet Summary: Ag6037 Expression of this gene is limited to skeletal 
muscle (CTs=30-31). Thus, expression of this gene could be used to differentiate these 
samples from other samples on this panel and as a marker of this tissue. Furthermore, 
therapeutic modulation of the expression or function of this gene may be useful in the 
treatment of metabolic disorders, including obesity and diabetes. 

V. CG146531-01: DIACYLGLYCEROL ACYLTRANSFERASE 2. 

Expression of gene CG146531-01 was assessed using the primer-probe set Ag6038, 
described in Table VA. 

Table VA. Probe Name Ag6038

Primers | Sequence | Length | Start Position | SEQ ID No
Forward | 5'-aaggtgtcacaggaagagcat-3' | 21 | 10 | 330
Probe | TET-5'-agccaggtcaccatggctttcttct-3'-TAMRA | 25 | 49 | 331
Reverse | 5'-gccctcctggagattcagt-3' | 19 | 78 | 332



CNS_neurodegeneration_v1.0 Summary: Ag6038 Expression of this gene is 
low/undetectable in all samples on this panel (CTs>35). 

General_screening_panel_v1.5 Summary: Ag6038 Expression of this gene is 
low/undetectable in all samples on this panel (CTs>35). 

Panel 4.1D Summary: Ag6038 Expression of this gene is low/undetectable in all 
samples on this panel (CTs>35). 

Panel 5 Islet Summary: Ag6038 Expression of this gene is low/undetectable in all 
samples on this panel (CTs>35). 

W. CG147274-01: Protease. 
Expression of gene CG147274-01 was assessed using the primer-probe set Ag5623, 
described in Table WA. 

Table WA. Probe Name Ag5623

Primers | Sequence | Length | Start Position | SEQ ID No
Forward | 5'-gatgtgctgccttcagaatg-3' | 20 | 64 | 333
Probe | TET-5'-aatcctcccggcctccttggagt-3'-TAMRA | 23 | 89 | 334
Reverse | 5'-gtccttcctgggtgtcttg-3' | 19 | 121 | 335



CNS_neurodegeneration_v1.0 Summary: Ag5623 Expression of this gene is 
low/undetectable in all samples on this panel (CTs>35). 

General_screening_panel_v1.5 Summary: Ag5623 Expression of this gene is 
low/undetectable in all samples on this panel (CTs>35). 

Panel 4.1D Summary: Ag5623 Expression of this gene is low/undetectable in all 
samples on this panel (CTs>35). 

X. CG147419-01: GLUTAMINE:FRUCTOSE-6-PHOSPHATE 
AMIDOTRANSFERASE 1 MUSCLE. 

Expression of gene CG147419-01 was assessed using the primer-probe set Ag5207, 
described in Table XA. Results of the RTQ-PCR runs are shown in Tables XB, XC, XD 
and XE. 

Table XA. Probe Name Ag5207

Primers | Sequence | Length | Start Position | SEQ ID No
Forward | 5'-gccctctgttgattggtgta-3' | 20 | 736 | 336
Probe | TET-5'-cggagtgaacataaactttctactgatca-3'-TAMRA | 29 | 756 | 337
Reverse | 5'-ccaatctgagtcctagctgttc-3' | 22 | 802 | 338
Table XB. CNS_neurodegeneration_v1.0

Tissue Name | Rel. Exp.(%) Ag5207, Run 226559656 | Tissue Name | Rel. Exp.(%) Ag5207, Run 226559656
AD 1 Hippo | 11.3 | Control (Path) 3 Temporal Ctx | 2.3
AD 2 Hippo | 14.6 | Control (Path) 4 Temporal Ctx | 54.7
AD 3 Hippo | 0.0 | AD 1 Occipital Ctx | 1.8
AD 4 Hippo | 6.3 | AD 2 Occipital Ctx (Missing) | 0.0
AD 5 hippo | 100.0 | AD 3 Occipital Ctx | 1.7
AD 6 Hippo | 29.3 | AD 4 Occipital Ctx | 11.5
Control 2 Hippo | 59.0 | AD 5 Occipital Ctx | 21.0
Control 4 Hippo | 0.0 | AD 6 Occipital Ctx | 97.9
Control (Path) 3 Hippo | 1.8 | Control 1 Occipital Ctx | 0.0
AD 1 Temporal Ctx | 12.5 | Control 2 Occipital Ctx | 100.0
AD 2 Temporal Ctx | 41.5 | Control 3 Occipital Ctx | 13.3
AD 3 Temporal Ctx | 2.2 | Control 4 Occipital Ctx | 2.2
AD 4 Temporal Ctx | 24.1 | Control (Path) 1 Occipital Ctx | 100.0
AD 5 Inf Temporal Ctx | 65.5 | Control (Path) 2 Occipital Ctx | 7.2
AD 5 Sup Temporal Ctx | 29.1 | Control (Path) 3 Occipital Ctx | 0.0
AD 6 Inf Temporal Ctx | 26.2 | Control (Path) 4 Occipital Ctx | 18.9
AD 6 Sup Temporal Ctx | 49.3 | Control 1 Parietal Ctx | 2.5
Control 1 Temporal Ctx | 0.0 | Control 2 Parietal Ctx | 53.2
Control 2 Temporal Ctx | 88.3 | Control 3 Parietal Ctx | 21.6
Control 3 Temporal Ctx | 19.5 | Control (Path) 1 Parietal Ctx | 94.6
Control 4 Temporal Ctx | 4.9 | Control (Path) 2 Parietal Ctx | 16.8
Control (Path) 1 Temporal Ctx | 97.3 | Control (Path) 3 Parietal Ctx | 4.0
Control (Path) 2 Temporal Ctx | 48.0 | Control (Path) 4 Parietal Ctx | 50.3

Table XC. General_screening_panel_v1.5

Tissue Name | Rel. Exp.(%) Ag5207, Run 228757767 | Tissue Name | Rel. Exp.(%) Ag5207, Run 228757767
Adipose | 9.9 | Renal ca. TK-10 | 2.9
Melanoma* Hs688(A).T | 4.0 | Bladder | 2.2
Melanoma* Hs688(B).T | 12.1 | Gastric ca. (liver met.) NCI-N87 | 23.2
Melanoma* M14 | 4.1 | Gastric ca. KATO III | 17.4
Melanoma* LOXIMVI | 0.7 | Colon ca. SW-948 | 0.4
Melanoma* SK-MEL-5 | 1.8 | Colon ca. SW480 |
Squamous cell carcinoma SCC-4 | 0.7 | Colon ca.* (SW480 met) SW620 | 0.1
Testis Pool | 2.8 | Colon ca. HT29 | 0.3
Prostate ca.* (bone met) PC-3 | 6.3 | Colon ca. HCT-116 | 0.2
Prostate Pool | 4.1 | Colon ca. CaCo-2 | 1.6
Placenta | 0.2 | Colon cancer tissue | 1.3
Uterus Pool | 5.6 | Colon ca. SW1116 | 0.0
Ovarian ca. OVCAR-3 | 0.2 | Colon ca. Colo-205 | 2.6
Ovarian ca. SK-OV-3 | 5.9 | Colon ca. SW-48 | 0.8
Ovarian ca. OVCAR-4 | 1.2 | Colon Pool | 11.3
Ovarian ca. OVCAR-5 | 1.6 | Small Intestine Pool | 4.2
Ovarian ca. IGROV-1 | 0.8 | Stomach Pool | 2.9
Ovarian ca. OVCAR-8 | 1.7 | Bone Marrow Pool | 2.1
Ovary | 0.7 | Fetal Heart | 45.7
Breast ca. MCF-7 | 0.3 | Heart Pool | 38.2
Breast ca. MDA-MB-231 | 3.8 | Lymph Node Pool | 11.3
Breast ca. BT 549 | 1.3 | Fetal Skeletal Muscle | 19.3
Breast ca. T47D | 0.0 | Skeletal Muscle Pool | 100.0
Breast ca. MDA-N | 0.2 | Spleen Pool | 0.5
Breast Pool | 6.4 | Thymus Pool | 4.0
Trachea | 1.0 | CNS cancer (glio/astro) U87-MG | 11.0
Lung | 1.5 | CNS cancer (glio/astro) U-118-MG | 24.0
Fetal Lung | 1.2 | CNS cancer (neuro;met) SK-N-AS | 3.4
Lung ca. NCI-N417 | 0.7 | CNS cancer (astro) SF-539 | 1.0
Lung ca. LX-1 | 0.6 | CNS cancer (astro) SNB-75 | 1.4
Lung ca. NCI-H146 | 0.5 | CNS cancer (glio) SNB-19 | 1.2
Lung ca. SHP-77 | 0.4 | CNS cancer (glio) SF-295 | 18.6
Lung ca. A549 | 4.8 | Brain (Amygdala) Pool | 3.7
Lung ca. NCI-H526 | 0.6 | Brain (cerebellum) | 4.6
Lung ca. NCI-H23 | 0.2 | Brain (fetal) | 0.2
Lung ca. NCI-H460 | 3.2 | Brain (Hippocampus) Pool | 3.1
Lung ca. HOP-62 | 4.3 | Cerebral Cortex Pool | 6.7
Lung ca. NCI-H522 | 2.0 | Brain (Substantia nigra) Pool | 4.3
Liver | 0.1 | Brain (Thalamus) Pool | 8.2
Fetal Liver | 0.4 | Brain (whole) | 4.4
Liver ca. HepG2 | 0.4 | Spinal Cord Pool | 1.2
Kidney Pool | 14.4 | Adrenal Gland | 2.6
Fetal Kidney | 0.2 | Pituitary gland Pool | 1.5
Renal ca. 786-0 | 1.2 | Salivary Gland | 0.4
Renal ca. A498 | 1.2 | Thyroid (female) | 0.4
Renal ca. ACHN | 1.6 | Pancreatic ca. CAPAN2 | 2.8
Renal ca. UO-31 | 1.5 | Pancreas Pool | 6.0
Table XD. Panel 4.1D

Tissue Name | Rel. Exp.(%) Ag5207, Run 229739304 | Tissue Name | Rel. Exp.(%) Ag5207, Run 229739304
Secondary Th1 act | 0.0 | HUVEC IL-1beta | 16.0
Secondary Th2 act | 4.2 | HUVEC IFN gamma | 9.6
Secondary Tr1 act | 0.0 | HUVEC TNF alpha + IFN gamma | 3.5
Secondary Th1 rest | 0.0 | HUVEC TNF alpha + IL4 | 0.0
Secondary Th2 rest | 0.0 | HUVEC IL-11 | 5.5
Secondary Tr1 rest | 0.0 | Lung Microvascular EC none | 1.1
Primary Th1 act | 0.0 | Lung Microvascular EC TNFalpha + IL-1beta | 0.0
Primary Th2 act | 5.6 | Microvascular Dermal EC none | 0.0
Primary Tr1 act | 5.6 | Microvascular Dermal EC TNFalpha + IL-1beta | 0.0
Primary Th1 rest | 0.0 | Bronchial epithelium TNFalpha + IL1beta | 0.0
Primary Th2 rest | 0.0 | Small airway epithelium none | 5.8
Primary Tr1 rest | 0.0 | Small airway epithelium TNFalpha + IL-1beta | 3.6
CD45RA CD4 lymphocyte act | 35.6 | Coronary artery SMC rest | 7.4
CD45RO CD4 lymphocyte act | 0.0 | Coronary artery SMC TNFalpha + IL-1beta | 13.6
CD8 lymphocyte act | 7.8 | Astrocytes rest | 0.0
Secondary CD8 lymphocyte rest | 0.0 | Astrocytes TNFalpha + IL-1beta | 12.9
Secondary CD8 lymphocyte act | 0.0 | KU-812 (Basophil) rest | 0.0
CD4 lymphocyte none | 10.7 | KU-812 (Basophil) PMA/ionomycin | 0.0
2ry Th1/Th2/Tr1_anti-CD95 CH11 | | CCD1106 (Keratinocytes) none | 10.0
LAK cells rest | 0.0 | CCD1106 (Keratinocytes) TNFalpha + IL-1beta | 0.0
LAK cells IL-2 | 0.0 | Liver cirrhosis | 0.0
LAK cells IL-2+IL-12 | 0.0 | NCI-H292 none | 0.0
LAK cells IL-2+IFN gamma | 0.0 | NCI-H292 IL-4 | 0.0
LAK cells IL-2+IL-18 | 0.0 | NCI-H292 IL-9 | 0.0
LAK cells PMA/ionomycin | 6.0 | NCI-H292 IL-13 | 0.0
NK Cells IL-2 rest | 4.8 | NCI-H292 IFN gamma | 0.0
Two Way MLR 3 day | 0.0 | HPAEC none | 4.7
Two Way MLR 5 day | 0.0 | HPAEC TNF alpha + IL-1 beta | 20.9
Two Way MLR 7 day | 0.0 | Lung fibroblast none | 17.7
PBMC rest | 7.0 | Lung fibroblast TNF alpha + IL-1 beta | 23.0
PBMC PWM | 0.0 | Lung fibroblast IL-4 |
PBMC PHA-L | 0.0 | Lung fibroblast IL-9 | 11.5
Ramos (B cell) none | 0.0 | Lung fibroblast IL-13 | 9.2
Ramos (B cell) ionomycin | | Lung fibroblast IFN gamma | 33.0
B lymphocytes PWM | 0.0 | Dermal fibroblast CCD1070 rest | 41.8
B lymphocytes CD40L and IL-4 | 0.0 | Dermal fibroblast CCD1070 TNF alpha | 100.0
EOL-1 dbcAMP | 0.0 | Dermal fibroblast CCD1070 IL-1 beta | 77.9
EOL-1 dbcAMP PMA/ionomycin | | Dermal fibroblast IFN gamma | 7.6
Dendritic cells none | 5.1 | Dermal fibroblast IL-4 | 15.3
Dendritic cells LPS | 0.0 | Dermal Fibroblasts rest | 34.6
Dendritic cells anti-CD40 | 0.0 | Neutrophils TNFa+LPS | 4.8
Monocytes rest | 6.6 | Neutrophils rest | 0.0
Monocytes LPS | 0.0 | Colon | 0.0
Macrophages rest | 0.0 | Lung | 12.3
Macrophages LPS | 0.0 | Thymus | 0.0
HUVEC none | 6.0 | Kidney | 0.0
HUVEC starved | 29.5 | |



Table XE. Panel 5 Islet

Tissue Name | Rel. Exp.(%) Ag5207, Run 263594763 | Tissue Name | Rel. Exp.(%) Ag5207, Run 263594763
97457_Patient-02go_adipose | 2.0 | 94709_Donor 2 AM - A_adipose | 4.6
97476_Patient-07sk_skeletal muscle | 3.1 | 94710_Donor 2 AM - B_adipose | 1.1
97477_Patient-07ut_uterus | 3.2 | 94711_Donor 2 AM - C_adipose | 0.8
97478_Patient-07pl_placenta | 2.0 | 94712_Donor 2 AD - A_adipose | 1.0
99167_Bayer Patient 1 | 1.0 | 94713_Donor 2 AD - B_adipose | 8.1
97482_Patient-08ut_uterus | 6.7 | 94714_Donor 2 AD - C_adipose | 5.3
97483_Patient-08pl_placenta | 0.0 | 94742_Donor 3 U - A_Mesenchymal Stem Cells | 1.2
97486_Patient-09sk_skeletal muscle | 27.4 | 94743_Donor 3 U - B_Mesenchymal Stem Cells | 3.7
97487_Patient-09ut_uterus | 12.4 | 94730_Donor 3 AM - A_adipose | 4.6
97488_Patient-09pl_placenta | 1.3 | 94731_Donor 3 AM - B_adipose | 2.1
97492_Patient-10ut_uterus | 14.4 | 94732_Donor 3 AM - C_adipose | 1.0
97493_Patient-10pl_placenta | 2.1 | 94733_Donor 3 AD - A_adipose | 6.9
97495_Patient-11go_adipose | 2.0 | 94734_Donor 3 AD - B_adipose | 3.2
97496_Patient-11sk_skeletal muscle | 50.3 | 94735_Donor 3 AD - C_adipose | 4.4
97497_Patient-11ut_uterus | 7.1 | 77138_Liver_HepG2untreated | 3.4
97498_Patient-11pl_placenta | 0.0 | 73556_Heart_Cardiac stromal cells (primary) | 2.2
97500_Patient-12go_adipose | 10.7 | 81735_Small Intestine | 7.1
97501_Patient-12sk_skeletal muscle | 100.0 | 72409_Kidney_Proximal Convoluted Tubule | 0.0
97502_Patient-12ut_uterus | 10.9 | 82685_Small intestine_Duodenum | 0.0
97503_Patient-12pl_placenta | 0.0 | 90650_Adrenal_Adrenocortical adenoma | 0.0
94721_Donor 2 U - A_Mesenchymal Stem Cells | 1.8 | 72410_Kidney_HRCE | 4.9
94722_Donor 2 U - B_Mesenchymal Stem Cells | 1.0 | 72411_Kidney_HRE | 0.0
94723_Donor 2 U - C_Mesenchymal Stem Cells | 3.5 | 73139_Uterus_Uterine smooth muscle cells | 4.0



CNS_neurodegeneration_v1.0 Summary: Ag5207 This panel does not show 
differential expression of this gene in Alzheimer's disease. However, this profile confirms 
the expression of this gene at moderate levels in the brain. Please see Panel 1.5 for 
discussion of this gene in the central nervous system. 

General_screening_panel_v1.5 Summary: Ag5207 Highest expression of this
gene is seen in skeletal muscle (CT=28). Low but significant expression is also seen in
pancreas, adrenal, pituitary, adipose, adult and fetal heart, and fetal skeletal muscle. This
gene encodes a protein that is homologous to glutamine:fructose-6-phosphate
amidotransferase (GFAT), which catalyzes the formation of glucosamine 6-phosphate and is
the first and rate-limiting enzyme of the hexosamine biosynthetic pathway. Enhanced
glucose flux via the hexosamine biosynthetic pathway has been implicated in the
induction of insulin resistance. Buse et al. showed in a mouse model that glucose flux via
the hexosamine pathway is selectively increased in muscle and may contribute to muscle
insulin resistance in non-insulin-dependent diabetes mellitus (Am J Physiol 1997
Jun;272(6 Pt 1):E1080-8). Thus, based on the homology of this enzyme to GFAT and the
high expression in muscle, modulation of the expression or function of this gene may be
useful in the treatment of type II diabetes.

This gene is widely expressed on this panel with moderate to low expression seen 
throughout the CNS, including the hippocampus, thalamus, substantia nigra, amygdala, 
cerebellum and cerebral cortex. Therefore, therapeutic modulation of the expression or 
function of this gene may be useful in the treatment of neurological disorders, such as
Alzheimer's disease, Parkinson's disease, schizophrenia, multiple sclerosis, stroke and 
epilepsy. 

Moderate to low levels of expression are also seen in many cancer cell lines on this 
panel, including gastric cancer and melanoma cell lines. Thus, modulation of this gene 
product may be useful in the treatment of cancer. 

Panel 4.1D Summary: Ag5207 Detectable levels of expression appear to be 
restricted to TNF-alpha treated dermal fibroblasts (CT=33.3). This expression suggests that 
this gene product may be involved in skin disorders, including psoriasis. 

Panel 5 Islet Summary: Ag5207 Highest expression is seen in skeletal muscle 
(CT=30.2), in agreement with Panel 1.5. Moderate to low levels of expression are also seen
in other metabolic tissues, including uterus and adipose. Please see Panel 1.5 for discussion 
of this gene in metabolic disease. 

Y. CG148102-01: CARNITINE 
O-PALMITOYLTRANSFERASE I. 

Expression of gene CG148102-01 was assessed using the primer-probe set Ag5274, 
described in Table YA. Results of the RTQ-PCR runs are shown in Tables YB, YC, YD 
and YE. 

Table YA. Probe Name Ag5274

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-cacttccgggacccacagt-3' | 19 | 1732 | 339
Probe | TET-5'-caccaggctctgctgaaggcagcc-3'-TAMRA | 24 | 1783 | 340
Reverse | 5'-caaacaggtggcggtcaact-3' | 20 | 1821 | 341
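
As a quick consistency check on Table YA, the listed lengths can be compared with the actual sequence lengths, and the start positions can be checked for the expected forward, probe, reverse order. The sketch below is illustrative only: the function names are hypothetical, and the ordering check assumes that all three start positions refer to the same coordinate system on CG148102-01, which this section does not state explicitly.

```python
# Primer-probe set Ag5274 as listed in Table YA:
# (name, sequence, listed length, listed start position).
AG5274 = [
    ("Forward", "cacttccgggacccacagt", 19, 1732),
    ("Probe", "caccaggctctgctgaaggcagcc", 24, 1783),
    ("Reverse", "caaacaggtggcggtcaact", 20, 1821),
]


def gc_fraction(seq):
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.lower()
    return sum(base in "gc" for base in seq) / len(seq)


def check_primer_set(rows):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for name, seq, listed_length, _start in rows:
        if set(seq.lower()) - set("acgt"):
            problems.append(f"{name}: unexpected characters in sequence")
        if len(seq) != listed_length:
            problems.append(
                f"{name}: listed length {listed_length}, actual {len(seq)}"
            )
    starts = [start for _name, _seq, _length, start in rows]
    if starts != sorted(starts):
        # Assumption: forward < probe < reverse on a shared coordinate system.
        problems.append("start positions are not in forward, probe, reverse order")
    return problems


if __name__ == "__main__":
    print(check_primer_set(AG5274))
    for name, seq, _length, _start in AG5274:
        print(name, round(gc_fraction(seq), 2))
```

The same check can be applied to the other primer-probe tables in this section by swapping in their rows.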



Table YB. CNS_neurodegeneration_v1.0



Tissue Name 


Rel. 

Exp.(%) 
Ag5274, 
Run 

230512893 


Tissue Name


Rel. 

Exp.(%) 
Ag5274, 
Run 

230512893 
AD 1 Hippo


19.3 


Control (Path) 3 Temporal Ctx


^1373 


AD 2 Hippo 


33.2 


Control (Path) 4 Temporal Ctx 


29.7 


AD 3 Hippo 


11.7 


AD 1 Occipital Ctx 


18.3 


AD 4 Hippo 


9.9 


AD 2 Occipital Ctx (Missing) 


0.0 


AD 5 Hippo 


95.9 


AD 3 Occipital Ctx 


7.5 


AD 6 Hippo 


43.5 


AD 4 Occipital Ctx 


15.1 


Control 2 Hippo 


57.0 


AD 5 Occipital Ctx 


66.4 


Control 4 Hippo 


11.9 


AD 6 Occipital Ctx 


13.1 


Control (Path) 3 Hippo 


8.5 


Control 1 Occipital Ctx 


3.7 


AD 1 Temporal Ctx 


17.0 


Control 2 Occipital Ctx 


98.6 


AD 2 Temporal Ctx 


29.5 


Control 3 Occipital Ctx 


27.5 


AD 3 Temporal Ctx 


8.3 


Control 4 Occipital Ctx 


4.5 


AD 4 Temporal Ctx 


19.6 


Control (Path) 1 Occipital Ctx 


100.0 


AD 5 Inf Temporal Ctx 


95.9 


Control (Path) 2 Occipital Ctx 


17.1 


AD 5 Sup Temporal Ctx 


53.6 


Control (Path) 3 Occipital Ctx 


3.8 


AD 6 Inf Temporal Ctx 


29.9 


Control (Path) 4 Occipital Ctx 


20.0 


AD 6 Sup Temporal Ctx 


33.2 


Control 1 Parietal Ctx 


10.5 


Control 1 Temporal Ctx 


8.4 


Control 2 Parietal Ctx 


49.3 


Control 2 Temporal Ctx 


70.2 


Control 3 Parietal Ctx 


19.2 


Control 3 Temporal Ctx 


25.0 


Control (Path) 1 Parietal Ctx 


94.6 


Control 3 Temporal Ctx 


11.3 


Control (Path) 2 Parietal Ctx 


25.0 


Control (Path) 1 Temporal Ctx 


74.2 


Control (Path) 3 Parietal Ctx 


6.0 


Control (Path) 2 Temporal Ctx 


44.4 


Control (Path) 4 Parietal Ctx 


50.7 



Table YC. General_screening_panel_v1.5




Tissue Name 


Rel. 

Exp.(%) 
Ag5274, 
Run 

230762793 


Tissue Name


Rel. 

Exp.(%) 
Ag5274, 
Run 

230762793 


Adipose 


1.2 


Renal ca. TK-10 


0.0 


Melanoma* Hs688(A).T 


7.4 


Bladder 


1.7 


Melanoma* Hs688(B).T 


13.0 


Gastric ca. (liver met.) NCI-N87 


1.0 


Melanoma* M14 


0.1 


Gastric ca. KATO III


0.2 


Melanoma* LOX IMVI


0.0


Colon ca. SW-948 


1.4 


Melanoma* SK-MEL-5 


0.0 


Colon ca. SW480 


0.7 


Squamous cell carcinoma SCC-4 


1.5 


Colon ca.* (SW480 met) SW620


0.0 


Testis Pool 


2.1 


Colon ca. HT29 


0.2 


Prostate ca.* (bone met) PC-3 


21.8 


Colon ca. HCT-116


2.1 


Prostate Pool 


0.8 


Colon ca. CaCo-2 


0.3 


Placenta 


0.7 


Colon cancer tissue 


2.4 


Uterus Pool 


0.7 


Colon ca. SW1116


0.0 
Ovarian ca. OVCAR-3


12.2 


Colon ca. Colo-205


^ qJ» ~Jlr .miJI J\ tm 3> 


Ovarian ca. SK-OV-3 


0.2 


Colon ca. SW-48 


0.0 


Ovarian ca. OVCAR-4 


0.1 


Colon Pool 


3.5


Ovarian ca. OVCAR-5 


2.8 


Small Intestine Pool 


2.1 


Ovarian ca. IGROV-1 


7.2


Stomach Pool 


1.8 


Ovarian ca. OVCAR-8 


3.9 


Bone Marrow Pool 


0.8 


Ovary 


6.3 


Fetal Heart 


1.7 


Breast ca. MCF-7 


0.2 


Heart Pool 


1.5 


Breast ca. MDA-MB-231 


4.9 


Lymph Node Pool 


5.3 


Breast ca. BT 549 


88.3 


Fetal Skeletal Muscle 


1.0 


Breast ca. T47D 


0.0 


Skeletal Muscle Pool 


0.8 


Breast ca. MDA-N 


0.0 


Spleen Pool 


3.0 


Breast Pool 


4.9 


Thymus Pool 


2.7 


Trachea 


1.0 


CNS cancer (glio/astro) U87-MG 


27.7


Lung 


0.9 


CNS cancer (glio/astro) U-118-MG


27.4 


Fetal Lung 


7.2 


CNS cancer (neuro;met) SK-N-AS 


86.5 


Lung ca. NCI-N417 


8.2 


CNS cancer (astro) SF-539 


0.0 


Lung ca. LX-1 


0.5 


CNS cancer (astro) SNB-75 


0.5 


Lung ca. NCI-H146


16.2 


CNS cancer (glio) SNB-19 


7.2 


Lung ca. SHP-77 


53.6 


CNS cancer (glio) SF-295 


17.3 


Lung ca. A549 


0.0 


Brain (Amygdala) Pool 


19.9 


Lung ca. NCI-H526


3.6 


Brain (cerebellum) 


100.0 


Lung ca. NCI-H23


40.9 


Brain (fetal)


44.8 


Lung ca. NCI-H460


0.6 


Brain (Hippocampus) Pool 


16.8 


Lung ca. HOP-62 


1.6 


Cerebral Cortex Pool 


24.0 


Lung ca. NCI-H522 


57.8 


Brain (Substantia nigra) Pool 


27.4 


Liver 


0.3 


Brain (Thalamus) Pool 


34.2 


Fetal Liver 


0.9 


Brain (whole) 


42.0 


Liver ca. HepG2 


0.0 


Spinal Cord Pool 


10.5 


Kidney Pool 


4.2 


Adrenal Gland 


1.0 


Fetal Kidney 


3.6 


Pituitary gland Pool 


4.9 


Renal ca. 786-0 


0.0 


Salivary Gland 


0.1 


Renal ca. A498 


0.0 


Thyroid (female) 


0.6 


Renal ca. ACHN 


0.5 


Pancreatic ca. CAPAN2 


0.0 


Renal ca. UO-31 


0.3 


Pancreas Pool 


4.8 



Table YD. Panel 4.1D



Tissue Name


Rel.

Exp.(%)
Ag5274,
Run

230472159


Tissue Name


Rel.

Exp.(%)
Ag5274,
Run

230472159


Secondary Th1 act


2.3 


HUVEC IL-1beta




Secondary Th2 act 


1.6 


HUVEC IFN gamma


92.0 


Secondary Tr1 act


0.0 


HUVEC TNF alpha + IFN gamma 


15.1 


Secondary Th1 rest


0.0 


HUVEC TNF alpha + IL4


11.7


Secondary Th2 rest 


2.3 


HUVEC IL-11 


67.8 


Secondary Tr1 rest


0.0 


Lung Microvascular EC none 


38.2 


Primary Th1 act


0.0 


Lung Microvascular EC TNFalpha 
+ IL-1beta


9.2 


Primary Th2 act 


0.0 


Microvascular Dermal EC none 


26.2 


Primary Tr1 act


0.0 


Microvascular Dermal EC
TNFalpha + IL-1 beta 


9.0 


Primary Th1 rest


0.0 


Bronchial epithelium TNFalpha + 
IL-1beta


0.0 


Primary Th2 rest




Small airway epithelium none 


0.0 


Primary Tr1 rest


0.0 


Small airway epithelium TNFalpha 
+ IL-1beta


0.0 


CD45RA CD4 lymphocyte act


7 R 


Coronary artery SMC rest


JO.O 


CD45RO CD4 lymphocyte act 


0.0 


Coronary artery SMC TNFalpha +
IL-1beta


66.9 


CD8 lymphocyte act 


0.0 


Astrocytes rest 


23.2 


Secondary CD8 lymphocyte rest 


0.0 


Astrocytes TNFalpha + IL-1beta


14.8 


Secondary CD8 lymphocyte act 


0.0 


KU-812 (Basophil) rest 


0.0 


CD4 lymphocyte none 


0.0 


KU-812 (Basophil) 
PMA/ionomycin 


0.0 


2ry Th1/Th2/Tr1_anti-CD95
CH11 


0.0 


CCD1106 (Keratinocytes) none


31.9 


LAK cells rest 


0.0 


CCD1106 (Keratinocytes)
TNFalpha + IL-1beta


9.4 


LAK cells IL-2 


0.0 


Liver cirrhosis 


5.1 


LAK cells IL-2+IL-12 


0.0 


NCI-H292 none 


0.0 


LAK cells IL-2+IFN gamma 


0.0 


NCI-H292 IL-4


0.0 


LAK cells IL-2+IL-18 


0.0 


NCI-H292 IL-9 


0.0 


LAK cells PMA/ionomycin 


0.0 


NCI-H292 IL-13 


0.0 


NK Cells IL-2 rest 


2.5 


NCI-H292 IFN gamma 


8.6 


Two Way MLR 3 day 


0.0 


HPAEC none 


45.4 


Two Way MLR 5 day 


0.0 


HPAEC TNF alpha + IL-1 beta 


27.9 


Two Way MLR 7 day 


0.0 


Lung fibroblast none 


100.0 


PBMC rest 


0.0 


Lung fibroblast TNF alpha + IL-1 
beta 


90.8 


PBMC PWM


2.2 


Lung fibroblast IL-4


22.2 


PBMC PHA-L


10.1 


Lung fibroblast IL-9 


47.6 


Ramos (B cell) none 


0.0 


Lung fibroblast IL-13 


11.8 


Ramos (B cell) ionomycin 


0.0 


Lung fibroblast IFN gamma 


61.1 


B lymphocytes PWM 


0.0 


Dermal fibroblast CCD 1070 rest 


28.7 


B lymphocytes CD40L and IL-4


2.2


Dermal fibroblast CCD1070 TNF 
alpha 


23.3 
EOL-1 dbcAMP


9.2 


Dermal fibroblast CCD1070 IL-1
beta


28.7


EOL-1 dbcAMP 
PMA/ionomycin 


2.7 


Dermal fibroblast IFN gamma


16.7 


Dendritic cells none 


0.0 


Dermal fibroblast IL-4


13.1 


Dendritic cells LPS 


0.0 


Dermal Fibroblasts rest 


58.6 


Dendritic cells anti-CD40 


0.0 


Neutrophils TNFa+LPS 


0.0 


Monocytes rest 


0.0 


Neutrophils rest 


0.0 


Monocytes LPS 


0.0 


Colon 


0.0 


Macrophages rest 


0.0 


Lung 


1.7 


Macrophages LPS 


0.0 


Thymus 


0.0 


HUVEC none 


48.3 


Kidney 


5.5


HUVEC starved 


61.1 







Table YE. Panel 5 Islet




Tissue Name 


Rel. 
Exp.(%)
Ag5274, 
Run 

307720339 


Tissue Name 


Rel. 

Exp.(%) 
Ag5274, 
Run 

307720339 


97457_Patient-02go_adipose 


15.3 


94709_Donor 2 AM - A_adipose


13.9 


97476_Patient-07sk_skeletal
muscle 


0.0 


94710_Donor 2 AM - B_adipose 


15.2 


97477_Patient-07ut_uterus


13.7 


94711_Donor 2 AM - C_adipose


19.8 


97478_Patient-07pl_placenta


9.0 


94712_Donor 2 AD - A_adipose 


58.2 


99167_Bayer Patient 1


51.8 


94713_Donor 2 AD - B_adipose


29.7 


97482_Patient-08ut_uterus 


24.3 


94714_Donor 2 AD - C_adipose 


34.9 


97483_Patient-08pl_placenta 


0.0 


94742_Donor 3 U - A_Mesenchymal
Stem Cells 


62.9 


97486_Patient-09sk_skeletal
muscle 


0.0 


94743_Donor 3 U - B_Mesenchymal 
Stem Cells 


39.5 


97487_Patient-09ut_uterus 


7.3 


94730_Donor 3 AM - A_adipose


31.4 


97488_Patient-09pl_placenta


11.9 


94731_Donor 3 AM - B_adipose


35.1 


97492_Patient-10ut_uterus


12.8 


94732_Donor 3 AM - C_adipose


49.3 


97493_Patient-10pl_placenta 


5.3 


94733_Donor 3 AD - A_adipose


28.9 


97495_Patient-11go_adipose


5.3 


94734_Donor 3 AD - B_adipose


44.8 


97496_Patient-11sk_skeletal
muscle 


3.8 


94735_Donor 3 AD - C_adipose 


17.7 


97497_Patient-11ut_uterus


20.9 


77138_Liver_HepG2untreated


6.0 


97498_Patient-11pl_placenta


5.4 


73556_Heart_Cardiac stromal cells
(primary) 


55.5 


97500_Patient-12go_adipose


27.0 


81735_Small Intestine 


39.0 
97501_Patient-12sk_skeletal
muscle 


12.5 


72409_Kidney_Proximal Convoluted
Tubule 


15.2


97502_Patient-12ut_uterus


10.2 


82685_Small intestine_Duodenum


0.0 


97503_Patient-12pl_placenta


2.4 


90650_Adrenal_Adrenocortical
adenoma 


12.2 


94721_Donor 2 U -
A_Mesenchymal Stem Cells 


100.0 


72410_Kidney_HRCE


0.0 


94722_Donor2U- 
B_Mesenchymal Stem Cells 


43.2 


72411_Kidney_HRE 


25.7 


94723_Donor 2 U -
C_Mesenchymal Stem Cells


63.7 


73139_Uterus_Uterine smooth 
muscle cells 


97.9 



CNS_neurodegeneration_v1.0 Summary: Ag5274 This panel confirms the
expression of this gene at low levels in the brain in an independent group of individuals.
This gene appears to be slightly down-regulated in the temporal cortex of Alzheimer's
disease patients. Therefore, up-regulation of this gene or its protein product, or treatment
with specific agonists for this receptor may be of use in reversing the dementia, memory
loss, and neuronal death associated with this disease.
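
The temporal cortex comparison can be illustrated with a crude group-mean calculation over the legible Table YB values. This is only a sketch and not the analysis method used for the panel: the helper name is hypothetical, all temporal cortex samples are pooled regardless of sub-region or control type, and the Control (Path) 3 Temporal Ctx value is omitted because it is not legible in the table.

```python
from statistics import mean

# Relative expression (%) for Ag5274, temporal cortex samples from Table YB.
# AD: AD 1-4, AD 5 Inf/Sup, AD 6 Inf/Sup.
AD_TEMPORAL = [17.0, 29.5, 8.3, 19.6, 95.9, 53.6, 29.9, 33.2]
# Controls: Control 1-3, the second "Control 3" row, Control (Path) 1, 2 and 4.
CONTROL_TEMPORAL = [8.4, 70.2, 25.0, 11.3, 74.2, 44.4, 29.7]


def mean_ratio(case_values, control_values):
    """Return mean(case) / mean(control) for two lists of relative expression."""
    return mean(case_values) / mean(control_values)


if __name__ == "__main__":
    print("AD mean:", round(mean(AD_TEMPORAL), 1))
    print("Control mean:", round(mean(CONTROL_TEMPORAL), 1))
    print("AD / control:", round(mean_ratio(AD_TEMPORAL, CONTROL_TEMPORAL), 2))
```

A ratio a little below 1 is what "slightly down-regulated" would look like under these assumptions; the wide spread of individual values means the group means alone do not establish significance.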

General_screening_panel_v1.5 Summary: Ag5274 Highest expression of this
gene is seen in the cerebellum (CT=29.3). Moderate expression of this gene is seen
throughout the brain. Thus, this gene would be useful for distinguishing brain tissue from
non-neural tissue, and may be beneficial as a drug target in neurodegenerative disease,
specifically in disorders that have the cerebellum as the site of pathology, such as autism and
the ataxias. Please see Panel CNS_neurodegeneration_v1.0 for further discussion of potential
utility in the central nervous system.
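
The relationship between a CT value and the percentages reported in the tables can be sketched as follows. This section does not state the formula, so the code assumes the common convention that a one-cycle increase in CT corresponds to half as much template (100% amplification efficiency) and that each panel is scaled so the sample with the lowest CT reads 100%. The function name and the two non-cerebellum samples are hypothetical.

```python
def relative_expression(ct_by_sample):
    """Scale CT values to percent relative expression (lowest CT = 100%).

    Assumes perfect doubling per PCR cycle; real assays may deviate.
    """
    ct_min = min(ct_by_sample.values())
    return {
        sample: 100.0 * 2.0 ** (ct_min - ct)
        for sample, ct in ct_by_sample.items()
    }


if __name__ == "__main__":
    example_ct = {
        "Brain (cerebellum)": 29.3,      # CT quoted in the summary above
        "hypothetical sample B": 30.3,   # illustrative value, not from the tables
        "hypothetical sample C": 32.3,   # illustrative value, not from the tables
    }
    for sample, percent in relative_expression(example_ct).items():
        print(f"{sample}: {percent:.1f}%")
```

Under these assumptions a sample one cycle later than the cerebellum reads about 50%, and one three cycles later about 12.5%.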

Low but significant expression is also seen in pancreas. This gene encodes a protein
with homology to carnitine palmitoyltransferase. Giannessi et al. have shown that inhibition
of this enzyme produces a significant reduction in serum glucose levels (J Med Chem 2001
Jul 19;44(15):2383-6). Thus, modulation of this enzyme may also be useful in the treatment
of obesity and/or diabetes.

Panel 4.1D Summary: Ag5274 Highest expression of this gene is seen in
untreated lung fibroblasts. Low but significant expression is also seen in a cluster of
treated and untreated lung and dermal fibroblasts. Low levels of expression are also seen in
coronary artery SMCs and HUVECs. This profile suggests that this gene could be used to
differentiate between these cells and other cell samples. In addition, this gene product may
be involved in inflammatory conditions of the lung and skin.

Panel 5 Islet Summary: Ag5274 Expression is limited to [...]
mesenchymal stem cells (CTs=34.5).

Z. CG148431-01 and CG148431-02: AMINOTRANSFERASE
SIMILAR TO SERINE PALMITOYLTRANSFERASE.

Expression of genes CG148431-01 and CG148431-02 was assessed using the
primer-probe set Ag5627, described in Table ZA. Results of the RTQ-PCR runs are shown
in Tables ZB, ZC, ZD and ZE. Please note that CG148431-02 represents a full-length
physical clone of the CG148431-01 gene, validating the prediction of the gene sequence.

Table ZA. Probe Name Ag5627

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-gggctcctataacttccttggt-3' | 22 | 555 | 342
Probe | TET-5'-tcctcatagactcatcatacttggctgca-3'-TAMRA | 29 | 579 | 343
Reverse | 5'-cctgtgccatacacctctaaaa-3' | 22 | 620 | 344



Table ZB. CNS_neurodegeneration_v1.0



Tissue Name


Rel.

Exp.(%)
Ag5627,
Run

246956910


Rel.

Exp.(%)
Ag5627,
Run

264979289


Tissue Name


Rel.

Exp.(%)
Ag5627,
Run

246956910


Rel.

Exp.(%)
Ag5627,
Run

264979289


AD 1 Hippo 


17.4 


57.0 


Control (Path) 3 
Temporal Ctx 


6.4 


8.2 


AD 2 Hippo 


67.8 


4.8 


Control (Path) 4 
Temporal Ctx 


10.3 


24.0 


AD 3 Hippo 


50.0 


62.4 


AD 1 Occipital Ctx 


11.8 


26.8 


AD 4 Hippo 


19.1 


30.8 


AD 2 Occipital Ctx 
(Missing) 


0.0 


0.0 


AD 5 Hippo 


17.0 


31.2 


AD 3 Occipital Ctx 


4.2 


25.9 


AD 6 Hippo 


100.0 


86.5 


AD 4 Occipital Ctx 


20.0 


27.9 


Control 2 Hippo 


24.1 


31.6 


AD 5 Occipital Ctx 


37.4 


17.0 


Control 4 Hippo 


50.7 


70.7 


AD 6 Occipital Ctx 


29.1 


22.4 


Control (Path) 3 
Hippo 


21.0 


24.3 


Control 1 Occipital 
Ctx


3.9 


12.1 


AD 1 Temporal Ctx 


43.8 


65.5 


Control 2 Occipital 
Ctx 


20.6 


29.9 
AD 2 Temporal Ctx


47.6 


100.0 


Control 3 Occipital
Ctx


9.3


19.9


AD 3 Temporal Ctx 


11.0 


23.0 


Control 4 Occipital 
Ctx 


16.3 


44.1 


AD 4 Temporal Ctx 


20.4 


33.9 


Control (Path) 1 
Occipital Ctx 


49.0 


58.2 


AD 5 Inf Temporal 
Ctx 


31.0 


31.2 


Control (Path) 2 
Occipital Ctx 


6.6 


15.2 


AD 5 Sup Temporal 
Ctx 


51.1 


63.3 


Control (Path) 3 
Occipital Ctx 


0.0 


1.6 


AD 6 Inf Temporal 
Ctx 


68.8 


87.7 


Control (Path) 4 
Occipital Ctx 


23.3 


14.3 


AD 6 Sup Temporal 
Ctx 


56.3 


97.3 


Control 1 Parietal 
Ctx 


13.1 


18.3 


Control 1 Temporal 
Ctx 


7.3 


4.5 


Control 2 Parietal 
Ctx 


31.6 


68.8 


Control 2 Temporal 
Ctx 


12.9 


31.6 


Control 3 Parietal 
Ctx 


7.9 


19.8 


Control 3 Temporal 
Ctx 


7.9 


15.0 


Control (Path) 1 
Parietal Ctx 


63.7 


87.1 


Control 3 Temporal 
Ctx 


13.8 


15.6 


Control (Path) 2 
Parietal Ctx 


51.1 


57.4 


Control (Path) 1 
Temporal Ctx 


30.1 


46.0 


Control (Path) 3 
Parietal Ctx 


3.1 


6.1 


Control (Path) 2 
Temporal Ctx 


28.7 


39.5 


Control (Path) 4 
Parietal Ctx 


54.7 


59.5 



Table ZC. Panel 4.1D




Tissue Name 


Rel. 
Exp.(%)
Ag5627, 
Run 

246490777 


Tissue Name 


Rel. 

Exp.(%) 
Ag5627, 
Run 

246490777 


Secondary Th1 act


0.0 


HUVEC IL-1beta


0.0 


Secondary Th2 act 


0.4 


HUVEC IFN gamma 


16.7 


Secondary Tr1 act


0.0 


HUVEC TNF alpha + IFN gamma 


0.3 


Secondary Th1 rest


0.0 


HUVEC TNF alpha + IL4


0.0 


Secondary Th2 rest 


0.0 


HUVEC IL-11 


1.2 


Secondary Tr1 rest


0.0 


Lung Microvascular EC none 


0.4 


Primary Th1 act


0.0 


Lung Microvascular EC TNFalpha 
+ IL-lbeta 


0.0 


Primary Th2 act 


0.2 


Microvascular Dermal EC none 


0.0 


Primary Tr1 act


0.2 


Microvascular Dermal EC
TNFalpha + IL-lbeta 


0.0 
Primary Th1 rest


0.0 


Bronchial epithelium TNFalpha +
IL-1beta


8.4


Primary Th2 rest 


0.0 


Small airway epithelium none


18.7




Primary Tr1 rest


0.0 


Small airway epithelium TNFalpha
+ IL-1beta


24.3 




CD45RA CD4 lymphocyte act 


2.7 


Coronary artery SMC rest






CD45RO CD4 lymphocyte act 


6.8 


Coronary artery SMC TNFalpha +
IL-lbeta 


2.8 




CD8 lymphocyte act 


0.0 


Astrocytes rest 


3.9 




Secondary CD8 lymphocyte rest


0.0


Astrocytes TNFalpha + IL-lbeta 


1.4 




Secondary CD8 lymphocyte act


0.0 


KU-812 (Basophil) rest 


8.0 


CD4 lymphocyte none 


0.0 


KU-812 (Basophil)
PMA/ionomycin


14.2 


2ry Th1/Th2/Tr1_anti-CD95
CH11 


0.4 


CCD1106 (Keratinocytes) none


17.4 


LAK cells rest 


0.0 


CCD1106 (Keratinocytes)
TNFalpha + IL-1beta


24.3 


LAK cells IL-2


0.0 


Liver cirrhosis 


13.3 


LAK cells IL-2+IL-12


0.2 


NCI-H292 none 


10.2 


LAK cells IL-2+IFN gamma


0.0 


NCI-H292 IL-4


36.3 


LAK cells IL-2+IL-18


0.0


NCI-H292 IL-9


21.5 


LAK cells PMA/ionomycin 


0.2 


NCI-H292 IL-13


27.7 


NK Cells IL-2 rest


11.8 


NCI-H292 IFN gamma


18.3 


Two Way MLR 3 day 


0.4 


HPAEC none 


0.8 


Two Way MLR 5 day 


0.0 


HPAEC TNF alpha + IL-1 beta


0.3 


Two Way MLR 7 day 


0.0 


Lung fibroblast none 


21.5 


PBMC rest 


0.0 


Lung fibroblast TNF alpha + IL-1 
beta 


2.7 


PBMC PWM


0.0 


Lung fibroblast IL-4 


10.2 


PBMC PHA-L 


1.3 


Lung fibroblast IL-9




Ramos (B cell) none 


0.0 


Lung fibroblast IL-13


1.3 


Ramos (B cell) ionomycin 


0.0 


Lung fibroblast IFN gamma 


43.5 


B lymphocytes PWM


0.0


Dermal fibroblast CCD1070 rest


0.0 


B lymphocytes CD40L and IL-4 


0.0 


Dermal fibroblast CCD1070 TNF
alpha


1.1 


EOL-1 dbcAMP


3.5 


Dermal fibroblast CCD1070 IL-1
beta


1.6 


EOL-1 dbcAMP 
PMA/ionomycin 


0.0


Dermal fibroblast IFN gamma 


39.5 


Dendritic cells none 


1.1 


Dermal fibroblast IL-4


12.0 


Dendritic cells LPS 


0.0 


Dermal Fibroblasts rest


16.0 


Dendritic cells anti-CD40 


0.0


Neutrophils TNFa+LPS 


0.0 


Monocytes rest 


0.0 


Neutrophils rest 


0.0


Monocytes LPS 


0.0 


Colon 


3.0 


Macrophages rest 


0.0 


Lung 


4.6 


Macrophages LPS 


0.0 


Thymus 


3.5 
HUVEC none


0.7 


Kidney




HUVEC starved 


2.9 







Table ZD. Panel 5 Islet



Tissue Name


Rel.

Exp.(%)
Ag5627,
Run

279371483


Rel.

Exp.(%)
Ag5627,
Run

312852505


Tissue Name


Rel.

Exp.(%)
Ag5627,
Run

279371483


Rel.

Exp.(%)
Ag5627,
Run

312852505


97457_Patient-02go_adipose


0.7 


1.7 


94709_Donor 2 AM -
A_adipose 


1.2 


1.6 


97476_Patient-07sk_skeletal
muscle


0.0 


0.0 


94710_Donor 2 AM - 
B_adipose 


1.1 


1.7 


97477_Patient-07ut_uterus 


0.4 


0.5 


94711_Donor 2 AM -
C_adipose 


0.8 


1.4 


97478_Patient-07pl_placenta


40.3 


46.0 


94712_Donor2AD- 
A_adipose 


2.7 


2.0 


99167_Bayer Patient 1


0.1 


0.1 


94713_Donor 2 AD -
B_adipose 


4.0 


3.0 


97482_Patient-08ut_uterus


0.2 


0.2 


94714_Donor 2 AD -
C_adipose


3.0 


3.0 


97483_Patient-08pl_placenta


82.9 


100.0 


94742_Donor 3 U -
A_Mesenchymal Stem Cells


0.4 


0.4 


97486_Patient-09sk_skeletal
muscle


0.2 


0.1 


94743_Donor 3 U -
B_Mesenchymal Stem Cells 


0.3 


0.6 


97487_Patient-09ut_uterus


0.2 


0.5 


94730_Donor 3 AM -
A_adipose 


3.5 


3.7 


97488_Patient-09pl_placenta


29.9


25.5 


94731_Donor 3 AM -
B_adipose 


5.3 


5.6 


97492_Patient-10ut_uterus


0.3 




94732_Donor 3 AM -
C_adipose 




A O 

4.8 


97493_Patient-10pl_placenta


100.0 


71.7 


94733_Donor 3 AD -
A_adipose 


2.6 


3.5 


97495_Patient-11go_adipose


1.2 


0.9


94734_Donor 3 AD -
B_adipose 


2.8 


3.6 


97496_Patient-11sk_skeletal
muscle


0.2 


0.1 


94735_Donor3AD- 
C_adipose 


0.5 


0.8 


97497_Patient-11ut_uterus


0.5 


0.8 


77138_Liver_HepG2untreated


39.5 


43.2 


97498_Patient-11pl_placenta


28.1 


31.6 


73556_Heart_Cardiac stromal 
cells (primary) 


0.1 


0.0 


97500_Patient-12go_adipose


1.0 


1.8 


81735_Small Intestine 


1.8 


1.9 


97501_Patient-12sk_skeletal
muscle


0.5 


0.6 


72409_Kidney_Proximal
Convoluted Tubule 


18.2 


19.1 
97502_Patient-12ut_uterus


0.3 


0.4 


82685_Small intestine_Duodenum


1.3


1.1


97503_Patient-12pl_placenta


85.9 


88.3 


90650_Adrenal_Adrenocortical
adenoma


0.6 


0.4 


94721_Donor 2 U -
A_Mesenchymal Stem
Cells


1.2 


1.3 


72410_Kidney_HRCE


3.7 


4.9 


94722_Donor 2 U -
B_Mesenchymal Stem 
Cells 


0.6 


0.8 


72411_Kidney_HRE


1.6 


1.7 


94723_Donor2U- 
C_Mesenchymal Stem 
Cells


1.0 


1.3 


73139_Uterus_Uterine
smooth muscle cells 


1.0 


0.7 



Table ZE. general_oncology_screening_panel_v_2.4




Tissue Name 


Rel.

Exp.(%) 
Ag5627, 
Run 

268787222 


Tissue Name


Rel.

Exp.(%) 
Ag5627, 
Run 

268787222 


Colon cancer 1 


2.8 


Bladder NAT 2 


0.3 


Colon NAT 1 


2.7 


Bladder NAT 3 


0.2 


Colon cancer 2 


7.8 


Bladder NAT 4 


1.1 


Colon NAT 2 


3.1 


Prostate adenocarcinoma 1 


11.8 


Colon cancer 3 


5.7 


Prostate adenocarcinoma 2 


1.0 


Colon NAT 3 


6.4 


Prostate adenocarcinoma 3 


8.6 


Colon malignant cancer 4 


3.0 


Prostate adenocarcinoma 4 


1.7 


Colon NAT 4 


2.4 


Prostate NAT 5 


1.1 


Lung cancer 1 


2.9 


Prostate adenocarcinoma 6 


2.6 


Lung NAT 1 


1.1 


Prostate adenocarcinoma 7 


3.3 


Lung cancer 2 


16.2 


Prostate adenocarcinoma 8 


0.6 


Lung NAT 2 


2.3 


Prostate adenocarcinoma 9 


6.5 


Squamous cell carcinoma 3 


4.8 


Prostate NAT 10 


1.4 


Lung NAT 3 


0.5 


Kidney cancer 1 


14.2 


Metastatic melanoma 1 


8.7 


Kidney NAT 1 


7.6 


Melanoma 2 


3.7 


Kidney cancer 2 


100.0 


Melanoma 3 


9.2 


Kidney NAT 2 


15.6 


Metastatic melanoma 4 


16.3 


Kidney cancer 3 


38.7 


Metastatic melanoma 5 


20.2 


Kidney NAT 3 


6.5 


Bladder cancer 1 


1.3 


Kidney cancer 4 


11.8 


Bladder NAT 1 


0.0 


Kidney NAT 4 


6.9


Bladder cancer 2 


3.9 

CNS_neurodegeneration_v1.0 Summary: Ag5627 Two experiments with the same
probe-primer sets are in good agreement. This panel confirms the expression of this gene
at low levels in the brain in an independent group of individuals. This gene is found to be
upregulated in the temporal cortex of Alzheimer's disease patients. Therefore, therapeutic
modulation of the expression or function of this gene may decrease neuronal death and be
of use in the treatment of this disease.

Panel 4.1D Summary: Ag5627 Highest expression of this gene is detected in
kidney. Moderate to low levels of expression of this gene are also seen in activated naive and
memory T cells, IL-2 treated NK cells, IFN gamma activated HUVEC cells, cytokine
activated bronchial epithelial cells, astrocytes, resting and activated small airway epithelial
cells, coronary artery SMC cells, basophils, keratinocytes, mucoepidermoid NCI-H292
cells, lung and dermal fibroblasts, a liver cirrhosis sample and normal tissues such as colon,
lung, and thymus. Therefore, therapeutic modulation of this gene or its protein product
through the use of a small molecule drug may be useful in the treatment of autoimmune and
inflammatory diseases such as asthma, allergies, inflammatory bowel disease, lupus
erythematosus, psoriasis, rheumatoid arthritis, and osteoarthritis.

Panel 5 Islet Summary: Ag5627 Two experiments with the same probe and primer
sets are in good agreement. Highest expression of this gene is detected in placenta of
diabetic and nondiabetic patients (CTs=26.4-26.7). Moderate to high levels of expression of
this gene are also seen in the liver HepG2 cell line, adipose, small intestine and kidney. This
gene codes for a homolog of serine palmitoyltransferase 2. Serine palmitoyltransferase
catalyzes the first, rate-limiting step in de novo ceramide biosynthesis. C2-ceramide inhibits
GLUT4 translocation by inhibiting Akt phosphorylation and activation in 3T3-L1
adipocytes, independently of effects on IRS-1 (Summers et al., 1998, Mol Cell Biol
18:5457-64, PMID: 9710629). Ceramide downregulates PDE3B and induces lipolysis in
3T3-L1 cells. Ceramide effects are reversed by troglitazone (Mei et al., 2002, Diabetes 51:
631-7, PMID: 11872660). Palmitate-induced insulin resistance involves elevation of de
novo ceramide synthesis in C2C12 myotubes (Schmitz-Peiffer et al., 1999, J Biol Chem
274:24202, PMID: 10446195). Therefore, inhibition of the novel serine
palmitoyltransferase through the use of a small molecule drug may be beneficial in the
treatment of diabetes.
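
One way to put a number on "good agreement" between two runs is the Pearson correlation of their relative expression values. The sketch below uses the first thirteen sample rows of Table ZD, in table order, with both values legible; the function name is hypothetical and this is not stated to be the method used for the panel.

```python
from math import sqrt

# Paired relative expression (%) for Ag5627, first and second run,
# for the first thirteen samples of Table ZD in table order.
RUN_1 = [0.7, 1.2, 0.0, 1.1, 0.4, 0.8, 40.3, 2.7, 0.1, 4.0, 0.2, 3.0, 82.9]
RUN_2 = [1.7, 1.6, 0.0, 1.7, 0.5, 1.4, 46.0, 2.0, 0.1, 3.0, 0.2, 3.0, 100.0]


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    print(round(pearson_r(RUN_1, RUN_2), 3))
```

Because each run is rescaled to its own maximum, a high correlation indicates a consistent expression pattern rather than identical absolute values.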

general_oncology_screening_panel_v_2.4 Summary: Ag5627 Highest expression
of this gene is detected in kidney cancer (CT=27.5). Moderate to high expression of this
gene is also seen in normal and cancer samples derived from colon, lung, bladder, prostate
and kidney. Moderate levels of expression of this gene are also seen in melanoma and
metastatic melanoma samples. Expression of this gene is strongly associated with kidney,
lung and bladder cancers as compared to the corresponding normal tissues. Therefore,
expression of this gene may be used as a diagnostic marker for detection of these cancers and
also, therapeutic modulation of this gene or its protein product may be useful in the
treatment of melanoma, colon, lung, bladder, prostate and kidney cancers.
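
The tumor-versus-normal comparison can be made concrete with the kidney rows of Table ZE. The sketch assumes that "Kidney cancer N" and "Kidney NAT N" are matched samples and that NAT denotes the corresponding normal tissue, neither of which this section states; the function name is hypothetical.

```python
# Relative expression (%) for Ag5627 from Table ZE: pair -> (cancer, NAT).
KIDNEY_PAIRS = {
    1: (14.2, 7.6),
    2: (100.0, 15.6),
    3: (38.7, 6.5),
    4: (11.8, 6.9),
}


def tumor_to_nat_ratio(tumor, nat):
    """Return tumor / NAT, guarding against an undetectable NAT signal."""
    if nat == 0:
        return float("inf") if tumor > 0 else float("nan")
    return tumor / nat


if __name__ == "__main__":
    for pair, (tumor, nat) in KIDNEY_PAIRS.items():
        ratio = tumor_to_nat_ratio(tumor, nat)
        print(f"Kidney pair {pair}: {ratio:.1f}-fold")
```

The zero guard matters for rows such as Bladder NAT 1, whose listed value is 0.0.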

AA. CG148888-01: GALNAC 4-SULFOTRANSFERASE. 

Expression of gene CG148888-01 was assessed using the primer-probe set Ag6854,
described in Table AAA. Results of the RTQ-PCR runs are shown in Table AAB. Please
note that CG148888-01 represents a full-length physical clone.

Table AAA. Probe Name Ag6854

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-accccagagccgcctggt-3' | 18 | 369 | 345
Probe | TET-5'-cttggcctgatgttgaactttattcctggcacc-3'-TAMRA | 33 | 408 | 346
Reverse | 5'-cagcctgcaggaccctacg-3' | 19 | 458 | 347




Table AAB. General_screening_panel_v1.6



Tissue Name 


Rel. 

Exp.(%) 
Ag6854, 
Run 

278020603 


Tissue Name


Rel. 

Exp.(%) 
Ag6854, 
Run 

278020603 


Adipose 


0.0 


Renal ca. TK-10 


0.0 


Melanoma* Hs688(A).T 


0.0 


Bladder 


0.1 


Melanoma* Hs688(B).T 


0.2 


Gastric ca. (liver met.) NCI-N87 


0.0 


Melanoma* M14 


0.0 


Gastric ca. KATO III


0.0 


Melanoma* LOX IMVI


0.0 


Colon ca. SW-948


0.0 


Melanoma* SK-MEL-5 


0.3 


Colon ca. SW480 


0.1 


Squamous cell carcinoma SCC-4 


0.1 


Colon ca.* (SW480 met) SW620 


0.0 


Testis Pool 


0.2 


Colon ca. HT29 


0.0 


Prostate ca.* (bone met) PC-3


0.0 


Colon ca. HCT-116 


0.0 


Prostate Pool 


0.0 


Colon ca. CaCo-2 


0.0 
Placenta


0.0 


Colon cancer tissue




Uterus Pool 


0.0 


Colon ca. SW1116


0.0 


Ovarian ca. OVCAR-3 


0.0 


Colon ca. Colo-205 


0.0 


Ovarian ca. SK-OV-3 


0.0 


Colon ca. SW-48 


0.0 


Ovarian ca. OVCAR-4 


0.0 


Colon Pool 


0.2 


Ovarian ca. OVCAR-5 


0.1 


Small Intestine Pool 


0.1 


Ovarian ca. IGROV-1


0.0 


Stomach Pool 


0.3 


Ovarian ca. OVCAR-8 


0.0 


Bone Marrow Pool 


0.1 


Ovary 


0.2 


Fetal Heart 


0.3 


Breast ca. MCF-7 


0.7 


Heart Pool 


0.0 


Breast ca. MDA-MB-231 


0.0 


Lymph Node Pool 


0.5 


Breast ca. BT 549 


0.0 


Fetal Skeletal Muscle 


0.0 


Breast ca. T47D 


0.0 


Skeletal Muscle Pool 


0.0 


Breast ca. MDA-N 


0.0 


Spleen Pool 


0.6 


Breast Pool 


0.2 


Thymus Pool 


0.5 


Trachea 


0.3 


CNS cancer (glio/astro) U87-MG 


0.0 


Lung 


0.2 


CNS cancer (glio/astro) U-118-MG


0.0 


Fetal Lung 


0.0 


CNS cancer (neuro;met) SK-N-AS 


2.2 


Lung ca. NCI-N417


0.0 


CNS cancer (astro) SF-539 


0.0 


Lung ca. LX-1 


0.0 


CNS cancer (astro) SNB-75 


0.7 


Lung ca. NCI-H146


0.0 


CNS cancer (glio) SNB-19 


0.0 


Lung ca. SHP-77 


100.0 


CNS cancer (glio) SF-295 


0.1 


Lung ca. A549 


0.0 


Brain (Amygdala) Pool 


3.7 


Lung ca. NCI-H526


0.4 


Brain (cerebellum) 


8.8 


Lung ca. NCI-H23


0.2 


Brain (fetal) 


16.2 


Lung ca. NCI-H460 


0.1 


Brain (Hippocampus) Pool 


3.6 


Lung ca. HOP-62 


0.0 


Cerebral Cortex Pool 


3.7 


Lung ca. NCI-H522


1.4 


Brain (Substantia nigra) Pool 


4.6 


Liver 


0.0 


Brain (Thalamus) Pool 


5.0 


Fetal Liver 


0.0 


Brain (whole) 


4.5 


Liver ca. HepG2 


0.0 


Spinal Cord Pool 


4.7 


Kidney Pool 


0.0 


Adrenal Gland 


0.2 


Fetal Kidney 


0.0 


Pituitary gland Pool 


8.0 


Renal ca. 786-0 


0.0 


Salivary Gland 


0.0 


Renal ca. A498 


0.0 


Thyroid (female) 


0.2 


Renal ca. ACHN 


0.0 


Pancreatic ca. CAPAN2 


0.1 


Renal ca. UO-31


0.0


Pancreas Pool 


0.2 



General_screening_panel_v1.6 Summary: Ag6854 Highest expression of this
gene is seen in a lung cancer cell line (CT=27.8). Thus, expression of this gene could be
used to differentiate between this sample and other samples on this panel and as a marker to
detect the presence of lung cancer. Furthermore, therapeutic modulation of the expression
or function of this gene may be effective in the treatment of lung cancer.

This gene is also expressed at moderate to low levels in the CNS, including the
hippocampus, thalamus, substantia nigra, amygdala, cerebellum and cerebral cortex.
Therefore, therapeutic modulation of the expression or function of this gene may be useful
in the treatment of neurological disorders, such as Alzheimer's disease, Parkinson's disease,
schizophrenia, multiple sclerosis, stroke and epilepsy.

AB. CG149008-01: NOVEL SODIUM/HYDROGEN 
EXCHANGER FAMILY MEMBER. 

Expression of gene CG149008-01 was assessed using the primer-probe set Ag5630,
described in Table ABA. Results of the RTQ-PCR runs are shown in Tables ABB, ABC,
ABD and ABE.

Table ABA. Probe Name Ag5630

Primers  Sequence                                        Length  Start Position  SEQ ID No
Forward  5'-tattttctgggtcaggctgat-3'                     21      770             348
Probe    TET-5'-tctctaaactcaacatgacagacagttttg-3'-TAMRA  30      795             349
Reverse  5'-cagatattagggagccaaacg-3'                     21      825             350


Table ABB. CNS_neurodegeneration_v1.0



Tissue Name


Rel. Exp.(%) Ag5630, Run 246956911


Tissue Name


Rel. Exp.(%) Ag5630, Run 246956911


AD 1 Hippo 


9.3 


Control (Path) 3 Temporal Ctx 


9.3 


AD 2 Hippo 


31.4 


Control (Path) 4 Temporal Ctx 


14.5 


AD 3 Hippo 


5.5 


AD 1 Occipital Ctx 


7.5 


AD 4 Hippo 


8.4 


AD 2 Occipital Ctx (Missing) 


0.0 


AD 5 hippo 


62.0 


AD 3 Occipital Ctx 


4.5 


AD 6 Hippo 


46.0 


AD 4 Occipital Ctx 


18.9 


Control 2 Hippo 


31.4 


AD 5 Occipital Ctx 


13.9 


Control 4 Hippo 


15.9 


AD 6 Occipital Ctx 


46.3 


Control (Path) 3 Hippo 


10.4 


Control 1 Occipital Ctx 


3.8 
AD 1 Temporal Ctx


12.0 


Control 2 Occipital Ctx


[value illegible in source]


AD 2 Temporal Ctx


41.8 


Control 3 Occipital Ctx 


6.1 


AD 3 Temporal Ctx 


2.3 


Control 4 Occipital Ctx 


13.2 


AD 4 Temporal Ctx 


25.7 


Control (Path) 1 Occipital Ctx 


62.0 


AD 5 Inf Temporal Ctx 


100.0 


Control (Path) 2 Occipital Ctx 


10.5 


AD 5 Sup Temporal Ctx


48.6 


Control (Path) 3 Occipital Ctx 


8.4 


AD 6 Inf Temporal Ctx 


36.9 


Control (Path) 4 Occipital Ctx 


11.8 


AD 6 Sup Temporal Ctx 


45.7 


Control 1 Parietal Ctx 


10.4 


Control 1 Temporal Ctx 


14.3 


Control 2 Parietal Ctx 


49.0 


Control 2 Temporal Ctx 


48.6 


Control 3 Parietal Ctx 


20.3 


Control 3 Temporal Ctx 


12.8 


Control (Path) 1 Parietal Ctx 


44.1


Control 4 Temporal Ctx


14.1 


Control (Path) 2 Parietal Ctx 


22.7 


Control (Path) 1 Temporal Ctx 


52.5 


Control (Path) 3 Parietal Ctx 


8.2 


Control (Path) 2 Temporal Ctx 


33.9 


Control (Path) 4 Parietal Ctx 


35.1 



Table ABC. General_screening_panel_v1.5



Tissue Name


Rel. Exp.(%) Ag5630, Run 245065625


Tissue Name


Rel. Exp.(%) Ag5630, Run 245065625


Adipose 


4.2 


Renal ca. TK-10 


32.8 


Melanoma* Hs688(A).T 


21.9 


Bladder 


9.5 


Melanoma* Hs688(B).T 


19.2 


Gastric ca. (liver met.) NCI-N87 


100.0 


Melanoma* M14 


41.2 


Gastric ca. KATO III


52.1 


Melanoma* LOXIMVI


25.2 


Colon ca. SW-948 


5.1 


Melanoma* SK-MEL-5 


20.0 


Colon ca. SW480 


27.2 


Squamous cell carcinoma SCC-4 


8.4 


Colon ca.* (SW480 met) SW620 


22.2 


Testis Pool 


9.1 


Colon ca. HT29 


10.5 


Prostate ca.* (bone met) PC-3 


5.8 


Colon ca. HCT-116


15.6 


Prostate Pool 


3.0 


Colon ca. CaCo-2 


25.9 


Placenta 


16.7 


Colon cancer tissue 


12.9 


Uterus Pool 


4.3 


Colon ca. SW1116 


3.4 


Ovarian ca. OVCAR-3 


35.6 


Colon ca. Colo-205 


19.8 


Ovarian ca. SK-OV-3 


15.4 


Colon ca. SW-48 


12.6 


Ovarian ca. OVCAR-4


9.5 


Colon Pool 


6.4 


Ovarian ca. OVCAR-5 


44.8 


Small Intestine Pool 


4.0 


Ovarian ca. IGROV-1 


13.9 


Stomach Pool 


3.7 


Ovarian ca. OVCAR-8 


8.0 


Bone Marrow Pool 


2.9 


Ovary 


3.8 


Fetal Heart 


4.1 


Breast ca. MCF-7 


14.9 


Heart Pool 


3.3 


Breast ca. MDA-MB-231 


25.2 


Lymph Node Pool 


6.8 
Breast ca. BT 549


32.1 


Fetal Skeletal Muscle


[value illegible in source]


Breast ca. T47D 


18.7 


Skeletal Muscle Pool 


15.6 


Breast ca. MDA-N 


9.3 


Spleen Pool 


5.4 


Breast Pool 


1.7 


Thymus Pool 


7.6 


Trachea 


18.4 


CNS cancer (glio/astro) U87-MG 


74.2 


Lung 


1.7 


CNS cancer (glio/astro) U-118-MG


34.4 


Fetal Lung 


9.2 


CNS cancer (neuro;met) SK-N-AS 


8.5 


Lung ca. NCI-N417 


4.8 


CNS cancer (astro) SF-539 


11.9 


Lung ca. LX-1 


24.1 


CNS cancer (astro) SNB-75 


43.2 


Lung ca. NCI-H146 


3.6 


CNS cancer (glio) SNB-19 


12.9 


Lung ca. SHP-77 


14.0 


CNS cancer (glio) SF-295 


30.8 


Lung ca. A549 


35.4 


Brain (Amygdala) Pool 


4.9 


Lung ca. NCI-H526


3.5 


Brain (cerebellum) 


23.7 


Lung ca. NCI-H23


23.5 


Brain (fetal) 


6.5 


Lung ca. NCI-H460


6.7 


Brain (Hippocampus) Pool 


7.5 


Lung ca. HOP-62 


7.6 


Cerebral Cortex Pool 


5.3 


Lung ca. NCI-H522


8.5 


Brain (Substantia nigra) Pool 


4.3 


Liver 


4.2 


Brain (Thalamus) Pool 


7.4 


Fetal Liver 


15.8


Brain (whole) 


5.4 


Liver ca. HepG2 


5.7 


Spinal Cord Pool 


6.4 


Kidney Pool 


7.7 


Adrenal Gland 


24.1 


Fetal Kidney 


5.0 


Pituitary gland Pool 


3.1 


Renal ca. 786-0 


19.9 


Salivary Gland 


13.2


Renal ca. A498 


14.3 


Thyroid (female) 


8.1


Renal ca. ACHN 


8.9 


Pancreatic ca. CAPAN2 


26.1 


Renal ca. UO-31 


32.1 


Pancreas Pool 


9.3 



Table ABD. Panel 4.1D



Tissue Name


Rel. Exp.(%) Ag5630, Run 246490808


Tissue Name


Rel. Exp.(%) Ag5630, Run 246490808


Secondary Th1 act


52.9 


HUVEC IL-1beta


21.9 


Secondary Th2 act 


86.5 


HUVEC IFN gamma


20.2 


Secondary Tr1 act


14.5 


HUVEC TNF alpha + IFN gamma 


6.7 


Secondary Th1 rest


2.2 


HUVEC TNF alpha + IL4


4.6 


Secondary Th2 rest 


1.7 


HUVEC IL-11 


12.6 


Secondary Tr1 rest


0.0 


Lung Microvascular EC none 


31.6 


Primary Th1 act


0.8 


Lung Microvascular EC TNFalpha 
+ IL-1beta


9.4 


Primary Th2 act 


42.6 


Microvascular Dermal EC none 


0.7 






Primary Tr1 act


35.4 


Microvascular Dermal EC
TNFalpha + IL-1beta


7.2


Primary Th1 rest


1.9 


Bronchial epithelium TNFalpha +
IL-1beta


4.2 






Small airway epithelium none




Primary Tr1 rest


0.3 


Small airway epithelium TNFalpha
+ IL-1beta


29.1 




JU.U 




Q Q 


CD45RO CD4 lymphocyte act 


49.3 


Coronary artery SMC TNFalpha + IL-1beta


13.3 


CD8 lymphocyte act 


4.6 


Astrocytes rest 


2.6 


Secondary CD8 lymphocyte rest 


29.9 


Astrocytes TNFalpha + IL-1beta


4.2 


Secondary CD8 lymphocyte act 


6.6 


KU-812 (Basophil) rest 


4.9 


CD4 lymphocyte none 


0.0 


KU-812 (Basophil)
PMA/ionomycin


11.9 


2ry Th1/Th2/Tr1_anti-CD95
CH11


2.5 


CCD1106 (Keratinocytes) none


28.3 


LAK cells rest 


11.1 


CCD1106 (Keratinocytes) 
TNFalpha + IL-1beta


18.6 


LAK cells IL-2 


9.7 


Liver cirrhosis 


4.6 


LAK cells IL-2+IL-12


2.3 


NCI-H292 none 


46.3


LAK cells IL-2+IFN gamma 


17.3 


NCI-H292 IL-4


46.0 


LAK cells IL-2+ IL-18 


9.5 


NCI-H292 IL-9 


69.3 


LAK cells PMA/ionomycin 


36.3 


NCI-H292 IL-13 


59.0 


NK Cells IL-2 rest 


17.0 


NCI-H292 IFN gamma 


33.9 


Two Way MLR 3 day 


9.4 


HPAEC none 


12.9 


Two Way MLR 5 day 


1.0 


HPAEC TNF alpha + IL-1 beta 


70.2 


Two Way MLR 7 day 


7.0 


Lung fibroblast none 


14.2 


PBMC rest 


0.9 


Lung fibroblast TNF alpha + IL-1 
beta 


20.0 


PBMC PWM


9.9 


Lung fibroblast IL-4


12.4 


PBMC PHA-L


[value illegible in source]


Lung fibroblast IL-9


[value illegible in source]


Ramos (B cell) none 


1.4 


Lung fibroblast IL-13


2.7 


Ramos (B cell) ionomycin 


28.5 


Lung fibroblast IFN gamma 


27.7 


B lymphocytes PWM 


19.6 


Dermal fibroblast CCD1070 rest 


33.9 


B lymphocytes CD40L and IL-4 


28.1 


Dermal fibroblast CCD1070 TNF 
alpha 


62.4 


EOL-1 dbcAMP 


3.8 


Dermal fibroblast CCD1070 IL-1
beta


18.3 


EOL-1 dbcAMP 
PMA/ionomycin 


0.4 


Dermal fibroblast IFN gamma 


19.3 


Dendritic cells none 


9.2 


Dermal fibroblast IL-4 


37.4 


Dendritic cells LPS 


3.2 


Dermal Fibroblasts rest 


15.8 


Dendritic cells anti-CD40 


3.8 


Neutrophils TNFa+LPS 


37.6 


Monocytes rest 


0.0 


Neutrophils rest 


41.2 


Monocytes LPS 


100.0 


Colon 


1.5 
Macrophages rest



6.0 



Lung 



Macrophages LPS 



10.6 



Thymus 



2.4 



HUVEC none 



12.6 



Kidney 



17.2 



HUVEC starved 



21.5 



Table ABE. Panel 5 Islet



Tissue Name


Rel. Exp.(%) Ag5630, Run [number illegible in source]


Tissue Name


Rel. Exp.(%) Ag5630, Run [number illegible in source]


97457_Patient-02go_adipose


15.5 


94709_Donor 2 AM - A_adipose 


26.6 


97476_Patient-07sk_skeletal
muscle 


0.0 


94710_Donor 2 AM - B_adipose


21.0 


97477_Patient-07ut_uterus


5.0 


94711_Donor 2 AM - C_adipose


16.7 


97478_Patient-07pl_placenta 


9.3 


94712_Donor 2 AD - A_adipose


55.9 


99167_Bayer Patient 1


100.0


94713_Donor 2 AD - B_adipose


74.7 


97482_Patient-08ut_uterus 


11.0 


94714_Donor 2 AD - C_adipose


54.7 


97483_Patient-08pl_placenta


7.9 


94742_Donor 3 U - A_Mesenchymal
Stem Cells


5.7 


97486_Patient-09sk_skeletal
muscle 


9.9 


94743_Donor 3 U - B_Mesenchymal
Stem Cells 


8.0 


97487_Patient-09ut_uterus


4.1 


94730_Donor 3 AM - A_adipose


8.3 


97488_Patient-09pl_placenta


10.3 


94731_Donor 3 AM - B_adipose


14.3 


97492_Patient-10ut_uterus


10.2 


94732_Donor 3 AM - C_adipose 


11.3 


97493_Patient-10pl_placenta


20.9 


94733_Donor 3 AD - A_adipose 


30.1 


97495_Patient-11go_adipose


5.8 


94734_Donor 3 AD - B_adipose


22.5 


97496_Patient-11sk_skeletal
muscle 


4.4 


94735_Donor 3 AD - C_adipose 


7.5 


97497_Patient-11ut_uterus


13.5 


77138_Liver_HepG2untreated


2.5 


97498_Patient-11pl_placenta


3.4 


73556_Heart_Cardiac stromal cells 
(primary) 


2.7 


97500_Patient-12go_adipose


37.1 


81735_Small Intestine


12.6 


97501_Patient-12sk_skeletal
muscle 


20.2 


72409_Kidney_Proximal Convoluted
Tubule 


28.1 


97502_Patient-12ut_uterus


22.8 


82685_Small intestine_Duodenum


24.0 


97503_Patient-12pl_placenta


13.1 


90650_Adrenal_Adrenocortical
adenoma 


7.3 


94721_Donor 2 U -
A_Mesenchymal Stem Cells


87.7 


72410_Kidney_HRCE


33.0 


94722_Donor 2 U - 
B_Mesenchymal Stem Cells 


75.8 


72411_Kidney_HRE


10.4 


94723_Donor 2 U -
C_Mesenchymal Stem Cells 


77.9 


73139_Uterus_Uterine smooth
muscle cells 


11.8 
CNS_neurodegeneration_v1.0 Summary: Ag5630 This panel confirms the
expression of this gene at low levels in the brains of an independent group of individuals.
However, no differential expression of this gene was detected between Alzheimer's
diseased postmortem brains and those of non-demented controls in this experiment. Please
see Panel 1.5 for a discussion of this gene in treatment of central nervous system disorders.
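
The lack of a clear difference between Alzheimer's and control samples can be illustrated with a descriptive summary of the hippocampus rows of Table ABB. This is a minimal sketch and not the analysis used in the original experiment: it applies no statistical test, and the grouping into two lists and the helper name `summarize` are assumptions made for illustration.

```python
from statistics import mean

# Rel. Exp.(%) values for hippocampus samples, transcribed from Table ABB
# (Ag5630, Run 246956911). The grouping into two lists is an illustrative choice.
AD_HIPPO = [9.3, 31.4, 5.5, 8.4, 62.0, 46.0]  # AD 1 to AD 6 Hippo
CONTROL_HIPPO = [31.4, 15.9, 10.4]            # Control 2, Control 4, Control (Path) 3 Hippo


def summarize(values):
    """Return (mean, minimum, maximum) of a list of relative expression values."""
    return mean(values), min(values), max(values)


if __name__ == "__main__":
    for label, values in (("AD", AD_HIPPO), ("Control", CONTROL_HIPPO)):
        avg, lowest, highest = summarize(values)
        print(f"{label}: mean={avg:.1f} range={lowest}-{highest}")
```

In these rows the control values (10.4 to 31.4) fall entirely within the spread of the AD values (5.5 to 62.0), which is consistent with the statement that no differential expression was detected.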

General_screening_panel_v1.5 Summary: Ag5630 Highest expression of this
gene is detected in a gastric cancer NCI-N87 cell line (CT=27.6). Moderate levels of
expression of this gene are also seen in a cluster of cancer cell lines derived from pancreatic,
gastric, colon, lung, liver, renal, breast, ovarian, prostate, squamous cell carcinoma,
melanoma and brain cancers. Thus, expression of this gene could be used as a marker to
detect the presence of these cancers. Furthermore, therapeutic modulation of the expression
or function of this gene may be effective in the treatment of pancreatic, gastric, colon, lung,
liver, renal, breast, ovarian, prostate, squamous cell carcinoma, melanoma and brain
cancers.

Among tissues with metabolic or endocrine function, this gene is expressed at
moderate levels in pancreas, adipose, adrenal gland, thyroid, pituitary gland, skeletal
muscle, heart, liver and the gastrointestinal tract. Therefore, therapeutic modulation of the
activity of this gene may prove useful in the treatment of endocrine/metabolically related
diseases, such as obesity and diabetes.

In addition, this gene is expressed at moderate levels in all regions of the central
nervous system examined, including amygdala, hippocampus, substantia nigra, thalamus,
cerebellum, cerebral cortex, and spinal cord. Therefore, therapeutic modulation of this gene
product may be useful in the treatment of central nervous system disorders such as
Alzheimer's disease, Parkinson's disease, epilepsy, multiple sclerosis, schizophrenia and
depression.

Panel 4.1D Summary: Ag5630 Highest expression of this gene is detected in LPS
treated monocytes (CT=29.7). Interestingly, this gene is expressed at much higher levels in
LPS activated monocytes than in resting monocytes (CT=40). This observation suggests
that expression of this gene can be used to distinguish activated from resting monocytes. In
addition, upon activation monocytes contribute to innate and specific immunity by
migrating to the site of tissue injury and releasing inflammatory cytokines. This release
contributes to the inflammation process. Therefore, modulation of the expression of the
protein encoded by this gene may prevent the recruitment of monocytes and the initiation
of the inflammatory process.
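
The two CT values quoted for monocytes can be converted into an approximate fold difference. The sketch below assumes ideal amplification (template doubling every cycle) and equal RNA input, neither of which is stated in this section. A CT of 40 is also commonly treated as no detectable signal, so the result is better read as a lower bound than as a measured ratio. The function name and its `efficiency` parameter are hypothetical.

```python
def fold_difference(ct_low_expr, ct_high_expr, efficiency=2.0):
    """Approximate fold difference between two samples from their CT values.

    Assumes the amount of product multiplies by `efficiency` each cycle
    (2.0 means perfect doubling, which is an idealization).
    """
    return efficiency ** (ct_low_expr - ct_high_expr)


if __name__ == "__main__":
    # CT values quoted in the Panel 4.1D summary for Ag5630.
    resting_ct = 40.0
    lps_ct = 29.7
    print(f"approximate fold difference: {fold_difference(resting_ct, lps_ct):.0f}")
```

Under these assumptions a gap of roughly ten cycles corresponds to a difference of about three orders of magnitude, which is why the summary treats the gene as a potential marker of monocyte activation.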

In addition, this gene is also expressed at moderate to low levels in activated
polarized T cells, naive and memory T cells, resting and activated LAK cells, resting IL-2
treated NK cells, two way MLR, activated PBMC cells and B lymphocytes, dendritic cells,
macrophages, different endothelial cells, bronchial and small airway epithelium, astrocytes,
basophils, keratinocytes, mucoepidermoid cells, lung and dermal fibroblasts, neutrophils
and kidney. Therefore, modulation of the gene product with a functional therapeutic may
lead to the alteration of functions associated with these cell types and lead to improvement
of the symptoms of patients suffering from autoimmune and inflammatory diseases such as
asthma, allergies, inflammatory bowel disease, lupus erythematosus, psoriasis, rheumatoid
arthritis, and osteoarthritis.

Panel 5 Islet Summary: Ag5630 Highest expression of this gene is detected in beta
islet cells (CT=26.7). In addition, this gene shows widespread expression in this panel, with
moderate to low expression in adipose, placenta, uterus, skeletal muscle, kidney, and small
intestine samples. Therefore, therapeutic modulation of this gene may be useful in the
treatment of metabolic/endocrine disorders, including obesity and Type I and Type II diabetes.

AC. CG149350-01 and CG149350-02: Vacuolar ATP synthase
subunit F.

Expression of gene CG149350-01 and CG149350-02 was assessed using the 
primer-probe set Ag7581, described in Table ACA. Results of the RTQ-PCR runs are 
shown in Table ACB. Please note that CG149350-02 represents a full-length physical clone 
of the CG149350-01 gene, validating the prediction of the gene sequence. 
Table ACA. Probe Name Ag7581

Primers  Sequence                                     Length  Start Position  SEQ ID No
Forward  5'-aagaactgccaccccaatt-3'                    19      88              351
Probe    TET-5'-cattgatggtcgtatccttctccacca-3'-TAMRA  27      113             352
Reverse  5'-aaattgccggaaagtgtctt-3'                   20      146             353
Table ACB. CNS_neurodegeneration_v1.0



Tissue Name


Rel. Exp.(%) Ag7581, Run 308752174


Tissue Name


Rel. Exp.(%) Ag7581, Run 308752174


AD 1 Hippo 


19.9 


Control (Path) 3 Temporal Ctx 


7.3 


AD 2 Hippo 


21.3 


Control (Path) 4 Temporal Ctx 


62.9 


AD 3 Hippo 


14.9 


AD 1 Occipital Ctx 


19.1 


AD 4 Hippo 


6.4 


AD 2 Occipital Ctx (Missing) 


0.0 


AD 5 hippo 


65.5 


AD 3 Occipital Ctx 


22.4 


AD 6 Hippo 


44.4 


AD 4 Occipital Ctx 


32.3 


Control 2 Hippo 


21.9 


AD 5 Occipital Ctx 


4.4 


Control 4 Hippo 


30.6 


AD 6 Occipital Ctx 


20.2 


Control (Path) 3 Hippo 


10.7 


Control 1 Occipital Ctx 


3.0 


AD 1 Temporal Ctx 


23.0 


Control 2 Occipital Ctx 


35.6 


AD 2 Temporal Ctx 


27.5 


Control 3 Occipital Ctx 


53.2 


AD 3 Temporal Ctx 


19.8 


Control 4 Occipital Ctx 


6.8 


AD 4 Temporal Ctx 


21.3 


Control (Path) 1 Occipital Ctx 


70.7 


AD 5 Inf Temporal Ctx 


46.3 


Control (Path) 2 Occipital Ctx 


17.9 


AD 5 Sup Temporal Ctx


55.9 


Control (Path) 3 Occipital Ctx 


4.2 


AD 6 Inf Temporal Ctx 


52.9 


Control (Path) 4 Occipital Ctx 


32.5 


AD 6 Sup Temporal Ctx 


47.3 


Control 1 Parietal Ctx 


8.7 


Control 1 Temporal Ctx 


23.5 


Control 2 Parietal Ctx 


56.3 


Control 2 Temporal Ctx 


28.9 


Control 3 Parietal Ctx 


32.5 


Control 3 Temporal Ctx 


22.2 


Control (Path) 1 Parietal Ctx 


100.0 


Control 4 Temporal Ctx 


9.1 


Control (Path) 2 Parietal Ctx 


38.4 


Control (Path) 1 Temporal Ctx 


45.7 


Control (Path) 3 Parietal Ctx 


17.6 


Control (Path) 2 Temporal Ctx 


62.0 


Control (Path) 4 Parietal Ctx 


64.2 



CNS_neurodegeneration_v1.0 Summary: Ag7581 No differential expression of
this gene was detected between Alzheimer's diseased postmortem brains and those of
non-demented controls in this experiment. However, this panel confirms the expression of
this gene at low levels in the brains of an independent group of individuals. Therefore,
therapeutic modulation of this gene product may be useful in the treatment of central
nervous system disorders such as Parkinson's disease, epilepsy, multiple sclerosis,
schizophrenia and depression.

AD. CG149536-01: PROTEIN-TYROSINE PHOSPHATASE,
NON-RECEPTOR TYPE 2.

Expression of gene CG149536-01 was assessed using the primer-probe sets Ag5255
and Ag6844, described in Tables ADA and ADB. Results of the RTQ-PCR runs are shown
in Tables ADC, ADD and ADE.

Table ADA. Probe Name Ag5255

Primers  Sequence                                    Length  Start Position  SEQ ID No
Forward  5'-cttatggtttggcagcagaa-3'                  20      355             354
Probe    TET-5'-ccaaagcagttgtcatgctgaaccgc-3'-TAMRA  26      377             355
Reverse  5'-tggtttcaccactcgattct-3'                  20      414             356



Table ADB. Probe Name Ag6844

Primers  Sequence                                    Length  Start Position  SEQ ID No
Forward  5'-agagaatcgagtggtgaaacc-3'                 21      412             357
Probe    TET-5'-actacctggccagattttggagtccc-3'-TAMRA  26      457             358
Reverse  5'-aggagccagattctctcacttta-3'               23      516             359
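
If the start positions in Tables ADA and ADB are read as the 5'-most coordinate of each oligonucleotide on the target sequence (an assumption; the tables do not define the convention), the span covered by each primer pair follows from the forward start, the reverse start and the reverse length. The function name below is hypothetical and the calculation is a sketch under that stated assumption.

```python
def amplicon_span(forward_start, reverse_start, reverse_length):
    """Number of bases from the first base of the forward primer site to the
    last base of the reverse primer site, assuming both start positions are
    leftmost coordinates on the same strand (an assumed convention)."""
    return reverse_start + reverse_length - forward_start


if __name__ == "__main__":
    # (forward start, reverse start, reverse length) taken from Tables ADA and ADB.
    primer_sets = {"Ag5255": (355, 414, 20), "Ag6844": (412, 516, 23)}
    for name, coords in primer_sets.items():
        print(name, amplicon_span(*coords))
```

Under this reading both primer pairs cover short stretches of the same transcript region, with the Ag6844 forward site overlapping the Ag5255 reverse site.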



Table ADC. CNS_neurodegeneration_v1.0



Tissue Name


Rel. Exp.(%) Ag5255, Run 229929883


Tissue Name


Rel. Exp.(%) Ag5255, Run 229929883


AD 1 Hippo 


28.9 


Control (Path) 3 Temporal Ctx 


21.0 


AD 2 Hippo 


42.3 


Control (Path) 4 Temporal Ctx 


38.7 


AD 3 Hippo 


42.0 


AD 1 Occipital Ctx 


45.4 


AD 4 Hippo 


5.9 


AD 2 Occipital Ctx (Missing) 


0.0 


AD 5 hippo 


92.7 


AD 3 Occipital Ctx 


36.9 


AD 6 Hippo 


29.7 


AD 4 Occipital Ctx 


23.5 


Control 2 Hippo 


52.5 


AD 5 Occipital Ctx 


13.6 


Control 4 Hippo 


22.4 


AD 6 Occipital Ctx 


47.6 


Control (Path) 3 Hippo 


17.9 


Control 1 Occipital Ctx 


3.2 
AD 1 Temporal Ctx


39.5 


Control 2 Occipital Ctx


[value illegible in source]


AD 2 Temporal Ctx 


56.3 


Control 3 Occipital Ctx 


31.2 


AD 3 Temporal Ctx 


23.3 


Control 4 Occipital Ctx 


5.0 


AD 4 Temporal Ctx 


10.9 


Control (Path) 1 Occipital Ctx 


99.3 


AD 5 Inf Temporal Ctx


44.8 


Control (Path) 2 Occipital Ctx 


40.3 


AD 5 Sup Temporal Ctx


53.2 


Control (Path) 3 Occipital Ctx 


0.0 


AD 6 Inf Temporal Ctx


68.8 


Control (Path) 4 Occipital Ctx 


24.0 


AD 6 Sup Temporal Ctx 


100.0 


Control 1 Parietal Ctx 


20.6 


Control 1 Temporal Ctx 


13.4 


Control 2 Parietal Ctx 


68.3 


Control 2 Temporal Ctx 


34.4 


Control 3 Parietal Ctx 


29.5 


Control 3 Temporal Ctx 


84.1 


Control (Path) 1 Parietal Ctx 


46.3 


Control 4 Temporal Ctx 


18.4 


Control (Path) 2 Parietal Ctx 


31.2 


Control (Path) 1 Temporal Ctx 


41.2 


Control (Path) 3 Parietal Ctx 


6.9 


Control (Path) 2 Temporal Ctx 


58.6 


Control (Path) 4 Parietal Ctx 


45.1 



Table ADD. General_screening_panel_v1.5



Tissue Name


Rel. Exp.(%) Ag5255, Run 230218532


Tissue Name


Rel. Exp.(%) Ag5255, Run 230218532


Adipose 


6.4 


Renal ca. TK-10 


18.8 


Melanoma* Hs688(A).T 


9.5 


Bladder 


10.8 


Melanoma* Hs688(B).T 


8.7 


Gastric ca. (liver met.) NCI-N87 


50.3 


Melanoma* M14 


19.1


Gastric ca. KATO III 


60.3 


Melanoma* LOXIMVI


25.5 


Colon ca. SW-948 


5.8 


Melanoma* SK-MEL-5 


18.8 


Colon ca. SW480 


100.0 


Squamous cell carcinoma SCC-4 


24.0 


Colon ca.* (SW480 met) SW620


23.3 


Testis Pool 


2.2 


Colon ca. HT29 


19.2 


Prostate ca.* (bone met) PC-3 


33.9 


Colon ca. HCT-116 


46.7 


Prostate Pool 


4.1 


Colon ca. CaCo-2 


49.3 


Placenta 


1.9 


Colon cancer tissue 


5.7 


Uterus Pool 


2.3 


Colon ca. SW1116


3.5 


Ovarian ca. OVCAR-3 


19.6 


Colon ca. Colo-205 


3.3 


Ovarian ca. SK-OV-3 


55.5 


Colon ca. SW-48 


0.5 


Ovarian ca. OVCAR-4 


8.5 


Colon Pool


5.9 


Ovarian ca. OVCAR-5


44.4 


Small Intestine Pool 


5.7 


Ovarian ca. IGROV-1 


5.7 


Stomach Pool 


3.2 


Ovarian ca. OVCAR-8 


7.8 


Bone Marrow Pool 


2.8 


Ovary 


8.0 


Fetal Heart 


3.7 


Breast ca. MCF-7 


38.2 


Heart Pool 


0.7 


Breast ca. MDA-MB-231 


13.4 


Lymph Node Pool 


4.1 
Breast ca. BT 549


51.8 


Fetal Skeletal Muscle




Breast ca. T47D 


5.4 


Skeletal Muscle Pool 


2.6 


Breast ca. MDA-N 


7.0 


Spleen Pool 


0.4 


Breast Pool 


9.0 


Thymus Pool 


19.2 


Trachea 


1.0 


CNS cancer (glio/astro) U87-MG 


26.4 


Lung 


5.7 


CNS cancer (glio/astro) U-118-MG 


33.2 


Fetal Lung 


17.1 


CNS cancer (neuro;met) SK-N-AS 


18.9 


Lung ca. NCI-N417 


1.0 


CNS cancer (astro) SF-539 


17.1 


Lung ca. LX-1 


12.6 


CNS cancer (astro) SNB-75 


12.2


Lung ca. NCI-H146


16.6 


CNS cancer (glio) SNB-19 


6.4 


Lung ca. SHP-77 


34.6 


CNS cancer (glio) SF-295 


16.0 


Lung ca. A549 


15.1 


Brain (Amygdala) Pool 


4.0 


Lung ca. NCI-H526 


6.7 


Brain (cerebellum) 


33.2 


Lung ca. NCI-H23


33.0 


Brain (fetal) 


54.0 


Lung ca. NCI-H460


7.2 


Brain (Hippocampus) Pool 


4.7 


Lung ca. HOP-62 


26.2 


Cerebral Cortex Pool 


5.3 


Lung ca. NCI-H522


35.1 


Brain (Substantia nigra) Pool 


4.0 


Liver 


0.9 


Brain (Thalamus) Pool 


6.8 


Fetal Liver 


7.2 


Brain (whole) 


4.9 


Liver ca. HepG2 


9.7 


Spinal Cord Pool 


17.0 


Kidney Pool 


7.3 


Adrenal Gland 


2.4 


Fetal Kidney 


16.3 


Pituitary gland Pool 


2.1 


Renal ca. 786-0 


7.1 


Salivary Gland 


1.5 


Renal ca. A498 


2.2 


Thyroid (female) 


1.1


Renal ca. ACHN 


9.2 


Pancreatic ca. CAPAN2 


66.4 


Renal ca. UO-31 


6.5 


Pancreas Pool 


7.2 



Table ADE. Panel 4.1D



Tissue Name


Rel. Exp.(%) Ag5255, Run 229851730


Rel. Exp.(%) Ag6844, Run 279029113


Tissue Name


Rel. Exp.(%) Ag5255, Run 229851730


Rel. Exp.(%) Ag6844, Run 279029113


Secondary Th1 act


39.0 


38.7 


HUVEC IL-1beta


39.8 


9.6 


Secondary Th2 act 


46.7 


55.9 


HUVEC IFN gamma 


12.5 


15.9 


Secondary Tr1 act


15.7 


18.9 


HUVEC TNF alpha +
IFN gamma 


21.0 


8.4 


Secondary Th1 rest


12.0 


3.9 


HUVEC TNF alpha + 
IL4 


12.1 


11.0 


Secondary Th2 rest 


0.0 


5.3 


HUVEC IL-11


13.6 


4.4 


Secondary Tr1 rest


0.0 


9.2 


Lung Microvascular 
EC none 


25.2 


18.4 
Primary Th1 act


17.9 


6.0 


Lung Microvascular
EC TNFalpha +
IL-1beta


2.6


9.4


Primary Th2 act 


15.0 


33.7 


Microvascular 
Dermal EC none 


6.0 


3.8 


Primary Tr1 act


18.2 


22.7 


Microvascular
Dermal EC
TNFalpha + IL-1beta


0.0 


3.7 


Primary Th1 rest


0.0 


1.9 


Bronchial epithelium 
TNFalpha + IL-1beta


9.3 


10.2 


Primary Th2 rest 


5.0 


1.5 


Small airway 
epithelium none 


0.0 


10.0 


Primary Tr1 rest


0.0 


0.0 


Small airway 
epithelium TNFalpha 

+ IL-1beta


37.1 


14.1 


CD45RA CD4 
lymphocyte act 


32.1 


13.9


Coronary artery SMC
rest 


11.1 


5.5 


CD45RO CD4
lymphocyte act 


58.6 


42.9 


Coronary artery SMC
TNFalpha + IL-1beta


11.3 


4.0 


CD8 lymphocyte act 


5.2 


18.7 


Astrocytes rest 


0.0 


1.1 


Secondary CD8 
lymphocyte rest 


10.9 


5.5 


Astrocytes TNFalpha 
+ IL-1beta


0.0 


1.8 


Secondary CD8 
lymphocyte act 


0.0 


4.4 


KU-812 (Basophil) 
rest 


38.4 


17.2


CD4 lymphocyte none


6.7 


3.4 


KU-812 (Basophil) 
PMA/ionomycin 


33.2 


38.7 


2ry Th1/Th2/Tr1_anti-CD95
CH11


0.0 


26.4 


CCD1106 

(Keratinocytes) none 


76.3 


40.1 


LAK cells rest 


19.1 


14.7 


CCD1106 (Keratinocytes)
TNFalpha + IL-1beta


13.1 


14.9 


LAK cells IL-2 


5.4 


7.3 


Liver cirrhosis 


15.8 


7.0 


LAK cells IL-2+IL-12




1 n 
l.U 


NCI-H292 none


1 




LAK cells IL-2+IFN
gamma


16.2 


7.7 


NCI-H292 IL-4 


45.4 


25.5 


LAK cells IL-2+ IL-18


J.i 


o.U 






31 2 


LAK cells
PMA/ionomycin 


27.9 


40.9 


NCI-H292 IL-13 


45.4 


38.4 


NK Cells IL-2 rest 


27.9 


40.3 


NCI-H292 IFN
gamma


26.2 


16.7 


Two Way MLR 3 day 


18.2 


27.0 


HPAEC none 


5.6 


6.3 


Two Way MLR 5 day 


23.3 


2.1 


HPAEC TNF alpha +
IL-1 beta 


21.5 


12.1 


Two Way MLR 7 day 


4.5 


1.7 


Lung fibroblast none 


22.5 


12.2 


PBMC rest 


3.2 


5.4 


Lung fibroblast TNF 
alpha + IL-1 beta 


6.3 


8.2 


PBMC PWM 


20.6 


9.8 


Lung fibroblast IL-4 


16.0 


13.5 
PBMC PHA-L


21.6 


12.1 


Lung fibroblast IL-9


[value illegible in source]


[value illegible in source]


Ramos (B cell) none 


40.3 


4.8 


Lung fibroblast IL-13


0.0 


5.8 


Ramos (B cell) 
ionomycin 


31.6 


17.7 


Lung fibroblast IFN
gamma 


37.6 


19.9 


B lymphocytes PWM 


26.6 


6.0 


Dermal fibroblast 
CCD1070 rest


32.3 


17.2 


B lymphocytes CD40L 
and IL-4


4.8 


37.6 


Dermal fibroblast 
CCD1070 TNF alpha


100.0 


54.7 


EOL-1 dbcAMP 


62.9 


74.2 


Dermal fibroblast 
CCD1070 IL-1 beta


34.6 


18.7 


EOL-1 dbcAMP 
PMA/ionomycin 


45.4 


15.1 


Dermal fibroblast 
IFN gamma 


17.1 


12.7 


Dendritic cells none 


33.7 


57.0 


Dermal fibroblast 

IL-4


5.3 


15.0 


Dendritic cells LPS 


21.0 


15.2 


Dermal Fibroblasts 
rest 


0.0 


6.9 


Dendritic cells 
anti-CD40 


10.2 


7.3 


Neutrophils 
TNFa+LPS 


0.0 


2.7 


Monocytes rest 


4.3 


32.1 


Neutrophils rest 


5.6 


6.1 


Monocytes LPS 


69.7 


100.0 


Colon 


0.0 


0.9 


Macrophages rest 


17.0 


3.8 


Lung 


0.0 


1.7 


Macrophages LPS 


0.0 


9.3 


Thymus 


15.2 


18.2 


HUVEC none 


5.9 


28.7 


Kidney 


6.3 


8.7 


HUVEC starved 


28.1 


8.5 









AI_comprehensive panel_v1.0 Summary: Ag5255 Expression of this gene is
low/undetectable (CTs > 35) across all of the samples on this panel.

CNS_neurodegeneration_v1.0 Summary: Ag5255 This panel confirms the
expression of this gene at low levels in the brains of an independent group of individuals.
However, no differential expression of this gene was detected between Alzheimer's
diseased postmortem brains and those of non-demented controls in this experiment. Please
see Panel 1.5 for a discussion of this gene in treatment of central nervous system disorders.

General_screening_panel_v1.5 Summary: Ag5255 Highest expression of this
gene is detected in a colon cancer SW480 cell line (CT=31.6). Moderate to low levels of
expression of this gene are also seen in a cluster of cancer cell lines derived from pancreatic,
gastric, colon, lung, liver, renal, breast, ovarian, prostate, squamous cell carcinoma,
melanoma and brain cancers. Thus, expression of this gene could be used as a marker to
detect the presence of these cancers. Furthermore, therapeutic modulation of the expression
or function of this gene may be effective in the treatment of pancreatic, gastric, colon, lung,
liver, renal, breast, ovarian, prostate, squamous cell carcinoma, melanoma and brain
cancers.

In addition, this gene is expressed at moderate levels in cerebellum and fetal brain.
Therefore, therapeutic modulation of this gene product may be useful in the treatment of
central nervous system disorders such as ataxia and autism.

Panel 4.1D Summary: Ag5255/Ag6844 Two experiments with different probe
and primer sets are in good agreement. The highest expression of this gene is detected in
TNF alpha activated dermal fibroblasts and LPS activated monocytes (CTs=32.7-32.9).
Moderate to low levels of expression of this gene are detected in activated polarized T cells,
naive and memory T cells, PMA/ionomycin activated LAK cells, resting IL-2 treated NK
cells, eosinophils, resting dendritic cells, activated basophils, resting keratinocytes, and
activated mucoepidermoid NCI-H292 cells. Therefore, therapeutic modulation of this gene
or its protein product may be useful in the treatment of autoimmune and inflammatory
diseases such as asthma, allergies, inflammatory bowel disease, lupus erythematosus,
psoriasis, rheumatoid arthritis, and osteoarthritis.
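
The agreement between the two probe sets can be examined numerically. The sketch below computes a Pearson correlation over a handful of rows transcribed from Table ADE. The choice of rows is arbitrary and purely illustrative, the helper name `pearson` is an assumption, and the original document does not state how agreement between the two runs was judged.

```python
from math import sqrt

# (Ag5255, Ag6844) Rel. Exp.(%) pairs transcribed from Table ADE.
# The subset of rows is an arbitrary, illustrative selection.
PAIRS = {
    "Secondary Th1 act": (39.0, 38.7),
    "Secondary Th2 act": (46.7, 55.9),
    "Secondary Tr1 act": (15.7, 18.9),
    "EOL-1 dbcAMP": (62.9, 74.2),
    "Dendritic cells none": (33.7, 57.0),
    "Monocytes rest": (4.3, 32.1),
    "Monocytes LPS": (69.7, 100.0),
    "Dermal fibroblast CCD1070 TNF alpha": (100.0, 54.7),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    ag5255 = [pair[0] for pair in PAIRS.values()]
    ag6844 = [pair[1] for pair in PAIRS.values()]
    print(f"Pearson r over {len(PAIRS)} rows: {pearson(ag5255, ag6844):.2f}")
```

Because each run is normalized to its own highest sample, individual rows can differ noticeably between probe sets even when the overall ranking of samples is similar.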

AE. CG149964-01: Brain mitochondrial carrier protein-1. 

Expression of gene CG149964-01 was assessed using the primer-probe set Ag7056,
described in Table AEA.

Table AEA. Probe Name Ag7056

Primers  Sequence                                        Length  Start Position  SEQ ID No
Forward  5'-tgtggttccaactgctcag-3'                       19      617             360
Probe    TET-5'-ctggtagctctactcctacaacgatggcag-3'-TAMRA  30      640             361
Reverse  5'-agatccacatgtcccatcatt-3'                     21      707             362



General_screening_panel_v1.6 Summary: Ag7056 Expression of this gene is
low/undetectable in all samples on this panel (CTs>35).

AF. CG150799-01, CG150799-02 and CG150799-03: MASS1

Expression of genes CG150799-01, CG150799-02 and CG150799-03 was assessed
using the primer-probe sets Ag5242, Ag5243, Ag5244, Ag5245, Ag5247 and Ag5248,
described in Tables AFA, AFB and AFC. Results of the RTQ-PCR runs are shown in
Tables AFD, AFE, AFF, AFG, AFH and AFI. Please note that probe-primer set Ag5243 is
specific for CG150799-02 and probe-primer sets Ag5244 and Ag5245 are specific for
CG150799-03.

Table AFA. Probe Name Ag5242

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-acgaatcccatgtgacacttt-3' | 21 | 3624 | 363
Probe | TET-5'-cccttcattataaaaccttgggttcca-3'-TAMRA | 27 | 3645 | 364
Reverse | 5'-tgactgttgtcttggcaatgt-3' | 21 | 3681 | 365



Table AFB. Probe Name Ag5243

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-gactccttccaaaggctatattgt-3' | 24 | 8809 | 366
Probe | TET-5'-cgattcaaggccctacaaatatctgcca-3'-TAMRA | 28 | 8849 | 367
Reverse | 5'-ccatttctggttccgtgtcta-3' | 21 | 8880 | 368



Table AFC. Probe Name Ag5244

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-actgataattctattcctgaactgga-3' | 26 | 4927 | 369
Probe | TET-5'-agctctgctagatctatctacagatataacgctgtaaaatc-3'-TAMRA | 41 | 4992 | 370
Reverse | 5'-aactcattatagatcatccaaaagga-3' | 26 | 5036 | 371



Table AFD. Probe Name Ag5245

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-accttgttgatgactttgctaatg-3' | 24 | [illegible] | 372
Probe | TET-5'-cagtggaactattacattccttccttggcaga-3'-TAMRA | 32 | 4345 | 373
Reverse | 5'-ggaagcgacacttcaatcaaa-3' | 21 | 4387 | 374


Table AFE. Probe Name Ag5247

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-acttacgttggacttaccatgg-3' | 22 | 8183 | 375
Probe | TET-5'-caacttcatttcctcccagactaggtatgagg-3'-TAMRA | 32 | 8211 | 376
Reverse | 5'-tcatttcatttgaagtgagcaa-3' | 22 | 8263 | 377


Table AFF. Probe Name Ag5248

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-accttgttgatgactttgctaatg-3' | 24 | 4320 | 378
Probe | TET-5'-cagtggaactattacattccttccttggcaga-3'-TAMRA | 32 | 4345 | 379
Reverse | 5'-caagaacatatatattcagaacctctgatc-3' | 30 | 4377 | 380



Table AFG. AI_comprehensive panel_v1.0

Tissue Name | Rel. Exp.(%) Ag5242, Run 305464510
110967 COPD-F | 0.1
110980 COPD-F | 1.1
110968 COPD-M | 0.1
110977 COPD-M | 4.4
110989 Emphysema-F | 0.2
110992 Emphysema-F | 2.7
110993 Emphysema-F | 0.1
110994 Emphysema-F | 0.1
110995 Emphysema-F | 6.8
110996 Emphysema-F | 2.0
110997 Asthma-M | 0.1
111001 Asthma-F | 0.5
111002 Asthma-F | 0.9
111003 Atopic Asthma-F | 1.5
111004 Atopic Asthma-F | 6.1
111005 Atopic Asthma-F | 2.5
111006 Atopic Asthma-F | 0.9
111417 Allergy-M | 0.8
112347 Allergy-M | 0.0
112349 Normal Lung-F | 0.0
112357 Normal Lung-F | 1.0
112354 Normal Lung-M | 0.7
112374 Crohns-F | 0.5
112389 Match Control Crohns-F | 0.2
112375 Crohns-F | [illegible]
[illegible] Match Control Crohns-F | [illegible]
112725 Crohns-M | 0.1
112387 Match Control Crohns-M | 0.1
112378 Crohns-M | 0.0
112390 Match Control Crohns-M | 1.5
112726 Crohns-M | 1.2
112731 Match Control Crohns-M | 0.9
112380 Ulcer Col-F | 1.0
112734 Match Control Ulcer Col-F | 0.8
112384 Ulcer Col-F | 3.7
112737 Match Control Ulcer Col-F | 0.8
112386 Ulcer Col-F | 0.2
[illegible] Match Control Ulcer Col-F | 0.5
112381 Ulcer Col-M | 0.0
112735 Match Control Ulcer Col-M | 0.0
112382 Ulcer Col-M | 0.3
112394 Match Control Ulcer Col-M | 0.1
112383 Ulcer Col-M | 4.5
112736 Match Control Ulcer Col-M | 0.3
112423 Psoriasis-F | 0.2
112427 Match Control Psoriasis-F | 2.3
112418 Psoriasis-M | 0.1
112723 Match Control Psoriasis-M | 0.5
112419 Psoriasis-M | 0.0
112424 Match Control Psoriasis-M | 0.2
112420 Psoriasis-M | 1.8
112425 Match Control Psoriasis-M | 3.7
104689 (MF) OA Bone-Backus | 0.2
104690 (MF) Adj "Normal" Bone-Backus | 0.6
104691 (MF) OA Synovium-Backus | [illegible]
104692 (BA) OA Cartilage-Backus | [illegible]
104694 (BA) OA Bone-Backus | [illegible]
104695 (BA) Adj "Normal" Bone-Backus | 0.4
104696 (BA) OA Synovium-Backus | 0.1
104700 (SS) OA Bone-Backus | 0.9
104701 (SS) Adj "Normal" Bone-Backus | 0.6
104702 (SS) OA Synovium-Backus | 0.2
117093 OA Cartilage Rep7 | 0.9
112672 OA Bone5 | 0.0
112673 OA Synovium5 | 0.1
112674 OA Synovial Fluid cells5 | 0.2
117100 OA Cartilage Rep14 | 0.0
112756 OA Bone9 | 100.0
112757 OA Synovium9 | 6.4
112758 OA Synovial Fluid Cells9 | 0.1
[illegible] RA Cartilage Rep2 | 0.0
113492 Bone2 RA | 31.6
113493 Synovium2 RA | 11.8
113494 Syn Fluid Cells RA | 22.2
113499 Cartilage4 RA | 22.7
113500 Bone4 RA | 28.1
113501 Synovium4 RA | 20.2
113502 Syn Fluid Cells4 RA | 16.4
113495 Cartilage3 RA | 22.7
113496 Bone3 RA | 24.5
113497 Synovium3 RA | 14.7
113498 Syn Fluid Cells3 RA | 33.0
117106 Normal Cartilage Rep20 | 0.0
113663 Bone3 Normal | 0.0
113664 Synovium3 Normal | 0.0
113665 Syn Fluid Cells3 Normal | 0.0
117107 Normal Cartilage Rep22 | 0.1
113667 Bone4 Normal | 0.4
113668 Synovium4 Normal | 0.1
113669 Syn Fluid Cells4 Normal | [illegible]



Table AFH. CNS_neurodegeneration_v1.0

Columns give Rel. Exp.(%) for the following probe sets and runs:
(1) Ag5242, Run 229661546
(2) Ag5242, Run 233609876
(3) Ag5243, Run 229661547
(4) Ag5243, Run 276863566
(5) Ag5243, Run 277731460
(6) Ag5244, Run 229661548
(7) Ag5244, Run 233610762
(8) Ag5244, Run 277731461
(9) Ag5245, Run 229661549
(10) Ag5245, Run 230510320
(11) Ag5247, Run 229661550
(12) Ag5247, Run 276863570
(13) Ag5248, Run 229661551
(14) Ag5248, Run 276863572
(15) Ag5248, Run 277731466

Tissue Name | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) | (11) | (12) | (13) | (14) | (15)
AD 1 Hippo | 22.4 | 21.6 | 29.3 | 31.6 | 27.5 | 9.1 | 0.0 | 3.1 | 16.0 | 0.0 | 9.0 | 6.7 | 14.9 | 13.7 | 17.9
AD 2 Hippo | 47.3 | 42.0 | 54.7 | 53.2 | 44.8 | 0.0 | 2.9 | 4.0 | 16.2 | 4.6 | 41.8 | 21.8 | 44.4 | 32.8 | 32.5
AD 3 Hippo | 12.2 | 13.5 | 17.8 | 13.6 | 10.9 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 5.8 | 0.0 | 9.8 | 4.8 | 6.8
AD 4 Hippo | 14.8 | 14.4 | 16.6 | 17.7 | 20.6 | 0.0 | 0.0 | 0.0 | 23.2 | 7.6 | 17.3 | 8.6 | 12.8 | 6.4 | 7.0
AD 5 Hippo | 65.5 | 84.1 | 61.6 | 63.7 | 57.4 | 6.7 | 0.0 | 4.3 | 11.6 | 5.3 | 84.7 | 31.0 | 85.3 | 61.1 | 62.0
AD 6 Hippo | 56.3 | 59.5 | 82.4 | 84.7 | 90.1 | 74.2 | 57.8 | 51.8 | 58.6 | 30.8 | 100.0 | 92.0 | 69.3 | 48.3 | 55.5
Control 2 Hippo | 29.5 | 25.7 | 29.3 | 31.6 | 31.9 | 0.0 | 0.0 | 5.5 | 15.1 | 29.1 | 42.0 | 29.9 | 27.7 | 25.0 | 26.1
Control 4 Hippo | 32.8 | 29.7 | 35.6 | 31.2 | 37.1 | 8.1 | 11.3 | 0.0 | 0.0 | 0.0 | 27.0 | 23.7 | 25.2 | 20.9 | 16.5
Control (Path) 3 Hippo, AD 1 Temporal Ctx, AD 2 Temporal Ctx | [the 45 values for these three rows are interleaved in the source and cannot be assigned to rows with certainty; in source order: 33.9, 32.3, 33.9, 24.7, 33.9, 35.8, 42.3, 24.0, 32.3, 30.8, 34.6, 35.4, 39.5, 51.1, 0.0, 4.5, 2.0, 46.3, 3.3, 0.0, 5.4, 2.9, 5.4, 0.0, 8.2, 0.0, 9.3, 19.5, 14.2, 0.0, 13.0, 12.2, 13.1, 19.8, 22.1, 29.9, 21.9, 28.7, 32.5, 26.2, 17.9, 38.2, 37.6, 26.1, 100.0]
AD 3 Temporal Ctx | 28.3 | 21.2 | 20.4 | 23.5 | 20.7 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.4 | 4.5 | 12.2 | 9.5 | 11.3
AD 4 Temporal Ctx | 47.3 | 44.8 | 36.6 | 39.0 | 45.4 | 10.3 | 0.0 | 8.3 | 39.0 | 19.2 | 43.5 | 25.3 | 33.0 | 25.9 | 29.5
AD 5 Inf Temporal Ctx | 73.7 | 100.0 | 100.0 | 100.0 | 100.0 | 0.0 | 11.4 | 17.6 | 24.5 | 0.0 | 43.5 | 74.7 | 76.8 | 100.0 | 79.6
AD 5 Sup Temporal Ctx | 93.3 | 77.4 | 87.7 | 82.4 | 88.3 | 7.3 | 10.7 | 8.9 | 29.1 | 3.3 | 45.7 | 59.0 | 82.4 | 70.7 | 64.2
AD 6 Inf Temporal Ctx | 59.0 | 58.2 | 62.0 | 0.0 | 58.2 | 55.9 | 94.0 | 49.0 | 55.5 | 100.0 | 87.1 | 87.7 | 71.7 | 46.3 | 65.1
AD 6 Sup Temporal Ctx | 85.3 | 99.3 | 74.2 | 74.7 | 90.1 | 100.0 | 100.0 | 100.0 | 99.3 | 73.7 | 95.9 | 97.3 | 97.3 | 60.3 | 94.0
Control 1 Temporal Ctx | 47.6 | 46.3 | 27.4 | 28.5 | 29.1 | 1.7 | 0.0 | 0.0 | 58.2 | 27.7 | 25.3 | 19.1 | 44.4 | 25.2 | 32.3
Control 2 Temporal Ctx | 37.6 | 37.4 | 30.6 | 27.5 | 32.8 | 2.7 | 11.0 | 4.5 | 31.4 | 48.3 | 8.9 | 15.4 | 50.0 | 34.4 | 29.1
Control 3 Temporal Ctx | 27.5 | 24.1 | 27.4 | 32.8 | 37.6 | 7.1 | 5.4 | 2.6 | 5.1 | 6.3 | 16.6 | 5.8 | 35.6 | 21.3 | 27.7
Control 3 Temporal Ctx | 38.2 | 39.0 | 34.6 | 30.6 | 31.9 | 8.7 | 2.6 | 4.9 | 8.4 | 0.0 | 31.4 | 13.7 | 22.4 | 26.1 | 31.4
Control (Path) 1 Temporal Ctx | 66.0 | 81.2 | 54.0 | 58.6 | 52.5 | 2.5 | 0.0 | 3.2 | 78.5 | 37.9 | 72.7 | 72.7 | 75.8 | 53.3 | 69.7
Control (Path) 2 Temporal Ctx | [illegible] | 50.0 | 40.1 | [illegible] | 41.5 | 2.0 | 10.9 | 3.0 | 30.1 | 10.9 | 42.6 | 51.9 | 42.9 | [illegible] | [illegible]
Control (Path) 3 Temporal Ctx | 23.3 | 24.5 | 19.9 | 21.5 | 22.4 | 2.9 | 2.3 | 7.7 | 0.0 | 4.1 | 5.8 | 4.3 | 21.0 | 14.5 | 16.2
Control (Path) 4 Temporal Ctx | 52.5 | 48.0 | 33.7 | 39.8 | 39.0 | 0.0 | 4.7 | 4.3 | 49.3 | 43.2 | 73.7 | 49.3 | 40.6 | 32.5 | 47.6
AD 1 Occipital Ctx | 18.0 | 18.8 | 22.8 | 25.7 | 24.3 | 0.0 | 3.0 | 0.0 | 0.0 | 0.0 | 10.2 | 13.8 | 19.9 | 12.8 | 14.3
AD 2 Occipital Ctx (Missing) | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
AD 3 Occipital Ctx | 15.5 | 14.0 | 17.8 | 17.8 | 18.0 | 0.0 | 0.0 | 0.0 | 3.2 | 0.0 | 10.2 | 0.0 | 10.3 | 5.2 | 5.5
AD 4 Occipital Ctx | 17.3 | 23.7 | 25.3 | 27.5 | 24.3 | 3.3 | 3.1 | 3.3 | 28.7 | 6.7 | 22.2 | 23.0 | 8.6 | 17.0 | 21.5
AD 5 Occipital Ctx | 22.4 | 26.1 | 21.3 | 15.2 | 22.5 | 2.0 | 3.1 | 5.3 | 25.7 | 5.1 | 16.7 | 8.7 | 3.3 | 20.9 | 18.4
AD 6 Occipital Ctx | 28.9 | 21.6 | 19.1 | 20.4 | 18.9 | 11.7 | 15.6 | 2.8 | 0.0 | 18.2 | 12.1 | 0.0 | 29.9 | 18.0 | 24.7
Control 1 Occipital Ctx | 9.3 | 10.2 | 6.8 | 6.1 | 7.4 | 0.0 | 0.0 | 0.0 | 4.3 | 7.8 | 5.6 | 0.0 | 4.8 | 3.7 | 3.3
Control 2 Occipital Ctx | 34.9 | 33.0 | 24.7 | 28.7 | 31.6 | 0.0 | 5.1 | 7.4 | 31.2 | 17.9 | 7.9 | 4.6 | 39.5 | 20.2 | 28.3
Control 3 Occipital Ctx | 27.2 | 24.1 | 27.5 | 25.2 | 24.5 | 2.4 | 9.2 | 4.2 | 7.0 | 0.0 | 13.8 | 0.0 | 14.5 | 17.6 | 16.8
Control 4 Occipital Ctx | 19.6 | 20.3 | 18.0 | 26.8 | 21.2 | 0.0 | 1.6 | 0.0 | 0.0 | 0.0 | 12.5 | 5.6 | 15.4 | 8.8 | 12.9
Control (Path) 1 Occipital Ctx | 56.6 | 64.6 | 48.6 | 58.6 | 57.8 | 9.1 | 5.1 | 7.8 | 66.4 | 30.6 | 69.7 | 57.0 | 76.3 | 42.6 | 55.5
Control (Path) 2 Occipital Ctx | 5.7 | 6.1 | 9.0 | 7.1 | 8.5 | 2.0 | 0.0 | 0.0 | 5.6 | 0.0 | 1.6 | 0.0 | 16.3 | 3.8 | 4.1
Control (Path) 3 Occipital Ctx | 2.6 | 3.1 | 4.1 | 1.9 | 4.5 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.5 | 1.8 | 1.7
Control (Path) 4 Occipital Ctx | 9.9 | 11.2 | 5.4 | 9.2 | 8.2 | 0.0 | 0.0 | 0.0 | 11.7 | 15.7 | 5.1 | 7.2 | 2.1 | 0.3 | 5.0
Control 1 Parietal Ctx | 28.9 | 32.5 | 19.6 | 21.8 | 22.4 | 0.0 | 0.0 | 0.0 | 0.0 | 3.6 | 9.8 | 16.4 | 23.8 | 16.3 | 17.1
Control 2 Parietal Ctx | 100.0 | 90.8 | 79.0 | 83.5 | 76.3 | 7.9 | 23.8 | 9.7 | 26.8 | 12.2 | 39.0 | 37.9 | 100.0 | 44.1 | 63.3
Control 3 Parietal Ctx | 14.8 | 11.9 | 17.3 | 15.3 | 17.0 | 0.0 | 9.8 | 0.0 | 0.0 | 0.0 | 1.7 | 7.2 | 12.9 | 8.5 | 11.3
Control (Path) 1 Parietal Ctx | 62.4 | 68.3 | 57.8 | 70.2 | 63.7 | 4.2 | 3.8 | 0.0 | 100.0 | 55.9 | 41.5 | 100.0 | 99.3 | 53.2 | 71.2
Control (Path) 2 Parietal Ctx | 17.1 | 19.8 | 22.1 | 21.0 | 25.9 | 1.9 | 10.4 | 0.0 | 30.8 | 0.0 | 17.9 | 18.0 | 6.3 | 10.2 | 15.8
Control (Path) 3 Parietal Ctx | 12.0 | 10.2 | 11.7 | 8.4 | 13.9 | 2.8 | 5.3 | 0.0 | 6.3 | 0.0 | 3.9 | 0.0 | 3.2 | 4.9 | 4.2
Control (Path) 4 Parietal Ctx | 30.1 | 25.5 | 26.1 | 25.7 | 29.1 | 1.5 | 0.0 | 0.0 | 59.0 | 40.9 | 23.7 | 30.1 | 25.9 | 26.4 | 17.8



Table AFI. General_screening_panel_v1.5

Columns give Rel. Exp.(%) for the following probe sets and runs:
(1) Ag5242, Run 229665046
(2) Ag5243, Run 229665047
(3) Ag5245, Run 229665049
(4) Ag5247, Run 229665052
(5) Ag5248, Run 229665053

Tissue Name | (1) | (2) | (3) | (4) | (5)
Adipose | 0.1 | 0.0 | 0.0 | 0.0 | 0.0
Melanoma* Hs688(A).T | 0.8 | 0.5 | 0.0 | 0.0 | 1.2
Melanoma* Hs688(B).T | 0.1 | 0.0 | 0.0 | 0.0 | 0.0
Melanoma* M14 | 0.2 | 0.3 | 0.0 | 0.0 | 0.1
Melanoma* LOXIMVI | 0.9 | 0.2 | 0.0 | 0.0 | 0.1
Melanoma* SK-MEL-5 | 0.6 | 1.6 | 0.0 | 1.2 | 0.0
Squamous cell carcinoma SCC-4 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
Testis Pool | 2.9 | 3.4 | 0.0 | 3.3 | 3.8
Prostate ca.* (bone met) PC-3 | 89.5 | 86.5 | 5.8 | 18.2 | 100.0
Prostate Pool | 10.7 | 8.5 | 0.7 | 2.4 | 7.0
Placenta | 0.0 | 0.1 | 0.0 | 1.0 | 0.1
Uterus Pool | 0.0 | 0.1 | 0.0 | 0.0 | 0.1
Ovarian ca. OVCAR-3 | 10.5 | 18.7 | 12.1 | 1.7 | 16.7
Ovarian ca. SK-OV-3 | 0.2 | 0.1 | 0.0 | 0.2 | 0.0
Ovarian ca. OVCAR-4 | 0.1 | 0.0 | 0.0 | 0.0 | 0.1
Ovarian ca. OVCAR-5 | 7.3 | 7.1 | 0.0 | 3.7 | 12.1
Ovarian ca. IGROV-1 | 1.4 | 3.5 | 0.0 | 0.0 | 0.5
Ovarian ca. OVCAR-8 | 8.5 | 13.0 | 0.9 | 0.5 | 10.7
Ovary | 0.1 | 0.4 | 0.0 | 0.0 | 1.0
Breast ca. MCF-7 | 11.1 | 10.2 | 0.0 | 3.6 | 16.4
Breast ca. MDA-MB-231 | 3.7 | 4.8 | 3.2 | 0.6 | 5.9
Breast ca. BT 549 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
Breast ca. T47D | 10.2 | 4.4 | 0.0 | 3.1 | 9.9
Breast ca. MDA-N | 0.1 | 0.2 | 0.0 | 0.0 | 0.5
Breast Pool | 0.8 | 1.9 | 0.0 | 0.9 | 1.5
Trachea | 7.4 | 6.4 | 0.9 | 20.3 | 9.5
Lung | 0.3 | 0.0 | 0.0 | 0.0 | 0.1
Fetal Lung | 25.7 | 20.9 | 1.7 | 6.7 | 22.2
Lung ca. NCI-N417 | 3.4 | 3.6 | 0.0 | 0.7 | 11.6
Lung ca. LX-1 | 0.1 | 0.0 | 0.0 | 0.0 | 0.0
Lung ca. NCI-H146 | 26.1 | 28.9 | 27.9 | 7.7 | 24.7
Lung ca. SHP-77 | 100.0 | 100.0 | 100.0 | 42.9 | 98.6
Lung ca. A549 | 0.9 | 1.3 | 0.0 | 0.0 | 1.1
Lung ca. NCI-H526 | 1.8 | 1.1 | 0.0 | 0.0 | 1.9
Lung ca. NCI-H23 | 0.0 | 0.0 | 0.0 | 0.0 | 0.2
Lung ca. [cell line name missing] | 5.4 | 3.3 | 9.3 | 48.3 | 23.5
Lung ca. HOP-62 | 7.0 | 8.8 | 0.0 | 0.0 | 8.4
Lung ca. NCI-H522 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
Liver | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
Fetal Liver | 0.0 | 0.0 | 0.0 | 0.6 | 0.2
Liver ca. HepG2 | 0.0 | 0.1 | 0.0 | 0.0 | 0.0
Kidney Pool | 1.0 | 1.0 | 0.0 | 0.4 | 1.6
Fetal Kidney | 8.5 | [illegible] | [illegible] | [illegible] | [illegible]
Renal ca. 786-0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
Renal ca. A498 | 0.1 | 0.0 | 0.0 | 0.0 | 0.0
Renal ca. ACHN | 0.4 | 0.1 | 0.0 | 0.0 | 0.5
Renal ca. UO-31 | 0.0 | 0.1 | 0.0 | 0.0 | 0.0
Renal ca. TK-10 | 0.0 | 0.1 | 0.0 | 0.0 | 0.0
Bladder | 2.6 | 1.8 | 0.0 | 2.5 | 3.7
Gastric ca. (liver met) NCI-N87 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
Gastric ca. KATO III | 0.0 | 0.0 | 0.0 | 0.0 | 0.1
Colon ca. SW-948 | 5.2 | 4.6 | 0.4 | 0.6 | 3.7
Colon ca. SW480 | 4.6 | 3.7 | 0.0 | 1.1 | 5.9
Colon ca.* (SW480 met) SW620 | 0.1 | 0.0 | 0.0 | 0.0 | 0.0
Colon ca. HT29 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
Colon ca. HCT-116 | 12.2 | 11.9 | 0.4 | 4.4 | 14.2
Colon ca. CaCo-2 | 13.8 | 14.0 | 8.9 | 6.6 | 16.4
Colon cancer tissue | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
Colon ca. SW1116 | 0.1 | 0.0 | 0.0 | 0.0 | 0.0
Colon ca. Colo-205 | 0.0 | 0.0 | 0.0 | 0.0 | 1.2
Colon ca. SW-48 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
Colon Pool | 0.1 | 0.0 | 0.0 | 0.6 | 0.1
Small Intestine Pool | 3.7 | 1.6 | 1.6 | 1.0 | 4.1
Stomach Pool | 1.6 | 0.7 | 0.0 | 0.4 | 0.9
Bone Marrow Pool | 0.1 | 0.0 | 0.0 | 0.0 | 0.1
Fetal Heart | 0.0 | 0.0 | 0.0 | 0.3 | 0.0
Heart Pool | 0.1 | 0.0 | 0.0 | 0.7 | 0.1
Lymph Node Pool | 0.5 | 0.0 | 0.0 | 0.6 | 0.1
Fetal Skeletal Muscle | 0.2 | 0.0 | 0.0 | 1.1 | 0.0
Skeletal Muscle Pool | 0.0 | 0.1 | 0.0 | 0.8 | 0.1
Spleen Pool | 1.5 | 0.1 | 0.5 | 2.3 | 0.6
Thymus Pool | 3.2 | 1.7 | 1.9 | 0.7 | 2.9
CNS cancer (glio/astro) U87-MG | 4.4 | 2.6 | 0.3 | 1.2 | 3.2
CNS cancer (glio/astro) U-118-MG | 0.1 | 0.0 | 0.0 | 0.0 | 0.0
CNS cancer (neuro;met) SK-N-AS | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
CNS cancer (astro) SF-539 | 0.2 | 0.0 | 0.0 | 0.0 | 0.1
CNS cancer (astro) SNB-75 | 0.1 | 0.1 | 0.0 | 0.0 | 0.2
CNS cancer (glio) SNB-19 | 2.0 | 4.1 | 0.0 | 0.6 | 3.4
CNS cancer (glio) SF-295 | 2.4 | 3.3 | 0.4 | 0.3 | 4.1
Brain (Amygdala) Pool | 13.4 | 29.1 | 1.8 | 4.2 | 14.6
Brain (cerebellum) | 14.2 | 13.4 | 0.8 | 6.1 | 15.6
Brain (fetal) | 89.5 | 100.0 | 15.1 | 100.0 | 93.3
Brain (Hippocampus) Pool | 35.4 | 47.3 | 6.6 | 13.7 | 31.9
Cerebral Cortex Pool | 40.1 | 53.2 | 8.9 | 35.1 | 39.0
Brain (Substantia nigra) Pool | 14.2 | 33.7 | 4.7 | 2.2 | 16.7
Brain (Thalamus) Pool | 37.9 | 43.2 | 0.8 | 25.5 | 45.1
Brain (whole) | 13.9 | 25.7 | 2.1 | 13.4 | 18.6
Spinal Cord Pool | 2.2 | 2.6 | 1.7 | 1.4 | 2.4
Adrenal Gland | 0.7 | 0.7 | 0.8 | 1.9 | 0.3
Pituitary gland Pool | [illegible] | 10.3 | [illegible] | 3.2 | 36.6
Salivary Gland | 0.1 | 0.5 | 0.0 | 0.0 | 0.1
Thyroid (female) | 11.6 | 12.2 | 0.2 | 0.7 | 9.4
Pancreatic ca. CAPAN2 | 0.1 | 0.0 | 0.0 | 0.0 | 0.1
Pancreas Pool | 3.6 | 2.3 | 0.0 | 1.7 | 2.0



Table AFJ. General_screening_panel_v1.6

Columns give Rel. Exp.(%) for the following probe sets and runs:
(1) Ag5243, Run 277218719
(2) Ag5243, Run 277729929
(3) Ag5245, Run 277219697
(4) Ag5245, Run 277730879
(5) Ag5247, Run 277219699
(6) Ag5247, Run 277729933
(7) Ag5248, Run 277219701
(8) Ag5248, Run 277730881

Tissue Name | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8)
Adipose | 0.1 | 0.2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.1 | 0.1
Melanoma* Hs688(A).T | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
Melanoma* Hs688(B).T | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.2 | 0.0
Melanoma* M14 | 0.2 | 0.0 | 0.7 | 0.0 | 0.0 | 0.0 | 0.0 | 0.3
Melanoma* LOXIMVI | 0.2 | 0.1 | 0.0 | 0.0 | 0.0 | 0.0 | 0.1 | 0.2
Melanoma* SK-MEL-5 | 2.5 | 1.3 | 0.0 | 0.0 | 0.0 | 0.9 | 0.1 | 0.4
Squamous cell carcinoma SCC-4 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.1
Testis Pool | 2.2 | 3.4 | 3.1 | 2.3 | 7.1 | 3.5 | 2.7 | 2.8
Prostate ca.* (bone met) PC-3 | 95.3 | 76.8 | 11.5 | 1.3 | 23.7 | 20.3 | 76.8 | 63.3
Prostate Pool | 6.8 | 7.5 | 0.0 | 0.0 | [illegible] | [illegible] | [illegible] | 7.6
Placenta | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.1 | 0.1
Uterus Pool | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.1
Ovarian ca. OVCAR-3 | 13.2 | 11.7 | 9.5 | 4.0 | 3.3 | 5.2 | 11.6 | 14.5
Ovarian ca. SK-OV-3 | 0.2 | 0.3 | 0.0 | 0.0 | 0.0 | 0.0 | 0.1 | 0.3
Ovarian ca. OVCAR-4 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
Ovarian ca. OVCAR-5 | [illegible] | 7.4 | 2.3 | 0.0 | 4.7 | 0.8 | 4.7 | 5.1
Ovarian ca. IGROV-1 | 2.0 | 2.8 | 0.7 | 0.0 | 0.0 | 0.0 | 1.1 | 3.3
Ovarian ca. OVCAR-8 | 14.2 | 8.1 | 3.6 | 0.0 | 7.5 | 8.1 | 8.2 | 13.4
Ovary | 0.1 | 0.6 | 0.0 | 0.0 | 0.0 | 0.0 | 0.7 | 0.2
Breast ca. MCF-7 | 7.4 | 8.0 | 0.0 | 0.0 | 3.5 | 9.4 | 8.0 | 9.2
Breast ca. MDA-MB-231 | 6.5 | 3.0 | 2.4 | 2.5 | 0.4 | 0.7 | 4.1 | 6.4
Breast ca. BT 549 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0
Breast ca. T47D | 6.7 | 3.8 | 0.8 | 0.0 | 5.5 | 1.5 | 4.7 | 8.0
Breast ca. MDA-N | 0.0 | 0.2 | 0.5 | 0.0 | 0.0 | 0.5 | 0.1 | 0.3
Breast Pool | 0.2 | 0.1 | 0.9 | 0.0 | 0.0 | 0.0 | 0.5 | 0.3
Trachea | 18.6 | 15.6 | 3.9 | 0.0 | 14.6 | 18.0 | 5.5 | 7.6
Lung | 0.2 | 0.0 | 0.0 | 0.0 | 0.0 | 1.2 | 0.0 | 0.1
Fetal Lung | 21.3 | 21.0 | 0.0 | 0.7 | 10.3 | 5.1 | 19.3 | 23.7
Lung ca. NCI-N417 | 6.3 | 3.2 | 0.0 | 0.0 | 1.7 | 4.2 | 2.4 | 2.0
Lung ca. LX-1 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
Lung ca. NCI-H146 | [illegible] | 20.4 | 17.0 | 100.0 | 7.1 | 9.8 | 16.8 | 16.4
Lung ca. SHP-77 | [illegible] | 77.9 | 100.0 | 35.6 | 74.7 | 31.9 | 100.0 | 76.3
Lung ca. A549 | 1.0 | 0.4 | 0.0 | 0.0 | 0.0 | 0.0 | 0.3 | 1.1
Lung ca. NCI-H526 | 1.4 | 1.9 | 0.0 | 0.0 | 0.0 | 0.0 | 0.7 | 0.5
Lung ca. NCI-H23 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.1
Lung ca. NCI-H460 | 2.8 | 2.1 | 0.0 | 0.0 | 0.9 | 0.9 | 3.1 | 3.4
Lung ca. HOP-62 | 12.4 | 6.5 | 0.0 | 0.0 | 0.6 | 0.0 | 9.4 | 11.6
Lung ca. NCI-H522 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
Liver | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
Fetal Liver | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9 | 0.2 | 0.0
Liver ca. HepG2 | 0.2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.1
Kidney Pool | 0.5 | 0.9 | 0.0 | 0.0 | 1.0 | 0.0 | 0.6 | 1.8
Fetal Kidney | 5.8 | 6.8 | 0.0 | 0.0 | 11.4 | 6.6 | 4.3 | 7.9
Renal ca. 786-0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
Renal ca. A498 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
Renal ca. ACHN | 0.0 | 0.2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.2 | 0.1
Renal ca. UO-31 | 0.2 | 0.2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.1
Renal ca. TK-10 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
Bladder | 1.2 | 1.5 | 0.0 | 0.0 | 3.8 | 1.4 | 3.3 | 3.2
Gastric ca. (liver met.) NCI-N87 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
Gastric ca. KATO III | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
Colon ca. SW-948 | 4.0 | 4.4 | 0.7 | 0.0 | 2.8 | 0.6 | 3.6 | 3.8
Colon ca. SW480 | 3.6 | 4.0 | 0.5 | 0.0 | 0.0 | 2.3 | 2.7 | 4.2
Colon ca.* (SW480 met) SW620 | 0.2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
Colon ca. HT29 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
Colon ca. HCT-116 | 13.8 | 12.7 | 1.0 | 0.0 | 6.8 | 3.1 | 5.6 | 14.7
Colon ca. CaCo-2 | 18.8 | 14.9 | 10.8 | 4.7 | 10.2 | 10.1 | 2.4 | 11.6
Colon cancer tissue | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.1
Colon ca. SW1116 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
Colon ca. Colo-205 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
Colon ca. SW-48 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
Colon Pool | 0.1 | 0.0 | 0.0 | 0.0 | 0.9 | 0.0 | 0.4 | 0.1
Small Intestine Pool | 0.7 | 1.4 | 1.6 | 1.6 | 0.7 | 3.0 | 8.9 | 1.7
Stomach Pool | 0.6 | 1.0 | 0.0 | 0.0 | 0.0 | [illegible] | [illegible] | [illegible]
Bone Marrow Pool | 0.0 | 0.1 | 0.0 | 0.0 | 0.0 | 0.6 | 0.0 | 0.1
Fetal Heart | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.1
Heart Pool | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
Lymph Node Pool | 0.0 | 0.7 | 0.0 | 0.0 | 0.8 | 0.0 | 0.5 | 0.4
Fetal Skeletal Muscle | 0.4 | 0.1 | 0.0 | 0.0 | 0.0 | 0.0 | 0.1 | 0.2
Skeletal Muscle Pool | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
Spleen Pool | 0.0 | 0.1 | 0.6 | 0.0 | 1.4 | 0.0 | 0.6 | 0.5
Thymus Pool | 2.0 | 2.1 | 1.0 | 0.7 | 1.4 | 2.6 | 1.9 | 3.2
CNS cancer (glio/astro) U87-MG | 2.6 | 2.5 | 0.8 | 0.0 | 0.7 | 0.6 | 3.7 | 4.3
CNS cancer (glio/astro) U-118-MG | 0.3 | 0.1 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
CNS cancer (neuro;met) SK-N-AS | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | [illegible]
CNS cancer (astro) SF-539 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.1 | 0.1
CNS cancer (astro) SNB-75 | 0.2 | 0.3 | 0.0 | 0.0 | 0.0 | 0.0 | [illegible] | [illegible]
CNS cancer [illegible] | 3.1 | 2.4 | 0.0 | 0.0 | 0.0 | 1.1 | 1.9 | 3.4
CNS cancer (glio) SF-295 | 2.8 | 2.2 | 0.5 | 0.6 | 0.9 | 2.6 | 3.1 | 2.8
Brain (Amygdala) Pool | 23.2 | 18.7 | [illegible] | [illegible] | 7.1 | 2.2 | 12.2 | 14.0
Brain (cerebellum) | 13.8 | 11.7 | 3.1 | 1.0 | 10.2 | 11.3 | 13.3 | 14.1
Brain (fetal) | 100.0 | 100.0 | 20.6 | 14.8 | 100.0 | 100.0 | [illegible] | 100.0
Brain (Hippocampus) Pool | 51.1 | 40.3 | 6.9 | 5.3 | 25.9 | 14.3 | 26.8 | 35.8
Cerebral Cortex Pool | 52.5 | 52.5 | 8.2 | 0.0 | 27.0 | 20.9 | 31.9 | 31.0
Brain (Substantia nigra) Pool | 29.5 | 29.1 | 1.1 | 1.7 | 5.5 | 2.9 | 9.7 | 12.2
Brain (Thalamus) Pool | 48.3 | 51.1 | 2.2 | 2.5 | 21.9 | 25.2 | 17.4 | 31.0
Brain (whole) | 28.7 | 30.6 | 6.0 | 4.2 | 15.2 | 13.3 | 9.2 | 14.7
[illegible] Pool | 1.9 | 1.3 | 1.3 | 0.0 | 1.0 | 0.0 | 1.6 | 2.2
Adrenal Gland | [illegible] | [illegible] | 1.5 | 0.0 | 0.0 | 0.8 | 0.3 | 0.4
Pituitary gland Pool | [illegible] | 13.7 | 2.6 | 7.4 | 0.0 | 11.1 | 13.4 | 15.8
Salivary Gland | 0.2 | 0.3 | 6.0 | 0.0 | 0.0 | 0.0 | 0.3 | 0.6
Thyroid (female) | 12.9 | 10.0 | 1.4 | 0.0 | 1.5 | 0.8 | 8.5 | 13.9
Pancreatic ca. CAPAN2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.1 | 0.0
Pancreas Pool | 2.6 | 3.2 | 0.0 | 0.0 | 0.6 | 3.6 | 4.5 | 3.7



Table AFK. Panel 4.1T) 



5 



Tissue Name 


Rel. 
Exp.() 

2, 

Run 

229819 

771 


ReL 
Exp.( 

%) 

Ag52 

45, 

Run 

22981 

9577 


Rel. 
Exp.( 

%) 

Ag524 
7, 

Run 

22981 

9792 


Rel. 
Exp.( 

%) 

Ag52 

48, 

Run 

22981 

9793 


Tissue Name 


ReL 

Exp.( 

%) 

Ag52 

42, 

Run 

22981 

9771 


Rel. 
Exp.( 

%) 

Ag524 
5, 

Run 
22981 


Rel. 

Exp.( 

IV ) 

AgS2 

47, 

Ran 

22981 

y/yz 


Rel. 
Exp.( 

/0 ) 

Ag524 
S, 

Run 
22981 

0*701 


Secondary Thl 
act 


0.0 


0.0 


0.0 


0.0 


HUVEC IL-lbeta 


0.2 


0.0 


0.0 


0.1 


Secondary Th2 
act 


0.6 


4.1 


0.7 


0.5 


HUVEC IFN gamma 


0.0 


0.0 


0.0 


0.0 


Secondary Trl act 


2.3 


1.2 


0.6 


2.3 


HUVECTNF alpha 
+ IFN gamma 


6.0 


0.0 


2.4 


7.7 


Secondary Thl 
rest 


0.0 


0.0 


0.0 


0.1 


HUVECTNF alpha 
+ IL4 


1.0 


0.0 


0.6 


4.2 


Secondary Th2 
rest 


13.7 


0.6 


5,1 


12.2 


HUVEC IL-11 


9.6 


1.6 


6.4 


9.2 


Secondary Trl 
rest 


15.5 


1.9 


8.7 


14.0 


Lung Microvascular 
EC none 


3.6 


0.9 


1.0 


2.4 


Primary Thl act 


100.0 


71.7 


100.0 


85.3 


Lung Microvascular 
EC TNFalpha + 
IL-lbeta 


0.0 


0.0 


0.0 


0.0 


Primary Th2 act 


27.9 


12.6 


20.4 


28.3 


Microvascular 
Dermal EC none 


0.1 


0.0 


0.0 


3.3 


Primary Trl act 


36.6 


9.4 


24.3 


28.9 


Microsvasular 
Dermal EC 
IWalpha + IL-lbeta 


3.0 


O.O 


3.0 


3.0 


Primary Thl rest 


15.9 


2.9 


5.1 


14.6 


bronchial epithelium 
rNFalpba + ILlbeta 


).l 


3.0 


3.0 ( 


).0 


Primary Th2 rest 


34.2 


3.4 ; 


23.3 : 


29.1 


Small airway f 
epithelium none 


).2 < 


).0 


).0 ( 


).2 
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Primary Trl rest 


12.0 


5.0 


12.7 


12.9 


Small airwa/ " ' 
epithelium TNFalpha 
+ IL-lbeta 


3.1 


i O Sr. 
0.0 


3a 

0.7 


3.7 


CD45RA CD4 
lymphocyte act 




0 0 


0 0 


0 0 


Coronery artery 
SMC rest 


4.1 


00 

\S.\J 


0 ft 




fTl4 

lymphocyte act 


0.0 


0.0 


0.0 


0.2 


Coronery artery 
SMC TNFalpha + 
IL-lbeta 


3.1 


0.0 


0.0 


2.6 


CD8 lymphocyte 
act 


5.6 


2.9 


0.7 


7.3 


Astrocytes rest 


3.8 


0.9 


0.6 


4.0 


Secondary CD8 
lymphocyte rest 






\J.\J 




Astrocytes TNFalpha 
+ IL-lbeta 




no 


ft ft 


n ft 


Secondary CD8 
lymphocyte act 


2.1 


0.0 


0.0 


1.9 


KU-812 (Basophil) 
rest 


0.0 


0.0 


0.0 


0.0 


CD4 lymphocyte 
none 


ft 1 
o.l 




J.O 


7 A 
f .*r 


KU-812 (Basophil) 
PMA/ionomycin 


1Z.0 


1 n 


HO 


10.4 


2ry 

Thl/Th2/Trl_anti 
-CD95CH11 


0.0 


0.0 


0.0 


0.0 


(Keratinocytes) none 


15.7 


15.5 


4.3 


15.8 


LAK cells rest 


0.1 


0.0 


0.6 


0.1 


CCD1106 
(Keratinocytes) 
TNFalpha + IL-lbeta 


0.0 


0.0 


0.0 


0.0 


LAK cells TL-2 


0.3 


0.0 


0.0 


0.3 


Liver cirrhosis 


0.1 


0.0 


0.0 


0.0 


LAK cells 
IL-2+IL-12 


25.2 


3.1 


4.8 


24.0 


NCI-H292none 


0.0 


0.0 


0.0 


0.0 


LAK cells 
IL-2+IFN gamma 


0.2 


0.0 


0.0 


1.1 


NCI-H292 L4 


0.0 


0.0 


0.0 


0.0 


LAK cells IL-2+ 
DL-18 


0.5 


0.0 


0.7 


0.7 


NCI-H292 IL-9 


0.0 


0.0 


0.0 


0.0 


LAK cells 
PMA/ionomycin 


0.2 


0.0 


0.6 


0.0 


NCI-H292 IL-13 


0.2 


0.0 


0.0 


0.0 


NK Cells IL-2 
rest 


0.5 


1.9 


0.0 


0.5 


NCI-H292 IFN 
gamma 


0.0 


0.0 


0.0 


0.0 


Two Way MLR 3 
day 


4.5 


5.1 


0.7 


2.3 


HPAEC none 


0.0 


0.0 


0.0 


0.0 


Two Way MLR 5 
day 


6.7 


14.9 


9.5 


15.0 


HPAEC TNF alpha 
+ BL-1 beta 


0.1 


0.0 


0.0 


0.0 


Two Way MLR 7 
day 


0.2 


0.0 


0.0 


0.1 


Lung fibroblast none 


0.0 


0.0 


0.0 


0.0 


PBMC rest 


8.7 


0.0 


2.3 


6.0 


Lung fibroblast TNF 
alpha + IL-1 beta 


19.9 


25.7 


4.4 


22.8 


PBMC PWM 


0.2 


0.0 


0.0 


0.4 


Lung fibroblast IL-4 


72.2 


100.0 


32.8 


49.7 


PBMC PHA-L 


0.2 


0.0 


0.0 


0.1 


Lung fibroblast IL-9 


1.2 


0.0 


0.4 


0.6 


Ramos (B cell) 
none 


3.6 


2.2 


1.1 


1.9 


Lung fibroblast 
IL-13 


1.8 


0.0 


1.5 


1.2 


Ramos (B cell) 
ionomycin 


1.8 


3.6 


1.5 


2.2 


Lung fibroblast IFN 
gamma 


0.0 


0.0 


0.0 


0.0 


B lymphocytes 
PWM 


1.3 


0.0 


2.0 


1.1 


Dermal fibroblast 
CCD1070 rest 


0.1 


0.0 


0.0 


0.0 
B lymphocytes 
CD40L and IL-4 


0.8 


0.7 


1.2 


1.5 


Dermal fibroblast 
CCD1070 TNF alpha 


2.9 


0.0 


1.3 


5.3 


EOL-1 dbcAMP 


3.7 


6.7 


3.3 


2.0 


Dermal fibroblast 
CCD1070 IL-1 beta 


6.3 


0.0 


1.7 


7.7 


EOL-1 dbcAMP 
PMA/ionomycin 


3.0 


0.0 


2.3 


2.0 


Dermal fibroblast 
IFN gamma 


0.0 


0.0 


0.0 


0.0 


Dendritic cells 
none 


10.7 


1.9 


3.8 


13.6 


Dermal fibroblast 
IL-4 


0.0 


0.0 


0.0 


0.0 


Dendritic cells 
LPS 


4.7 


6.2 


11.7 


8.2 


Dermal Fibroblasts 
rest 


0.0 


0.0 


0.0 


0.0 


Dendritic cells 
anti-CD40 


0.1 


0.0 


0.0 


0.0 


Neutrophils 
TNFa+LPS 


0.1 


0.0 


0.0 


0.0 


Monocytes rest 


11.6 


0.6 


2.8 


16.4 


Neutrophils rest 


87.7 


11.7 


28.3 


100.0 


Monocytes LPS 


4.6 


5.6 


1.4 


5.4 


Colon 


0.0 


0.0 


0.0 


0.0 


Macrophages rest 


0.2 


0.0 


0.0 


0.1 


Lung 


0.2 


0.0 


0.0 


0.3 


Macrophages LPS 


11.5 


0.0 


0.9 


9.2 


Thymus 


0.1 


0.0 


0.0 


0.6 


HUVEC none 


0.3 


0.0 


0.0 


0.5 


Kidney 


0.1 


0.0 


1.4 


0.6 


HUVEC starved 


15.9 


8.4 


2.4 


15.5 









Table AFL. general oncology screening panel_v_2.4 



Tissue Name 


Rel. Exp.(%) Ag5242, Run 260269083 


Rel. Exp.(%) Ag5247, Run 260269132 


Rel. Exp.(%) Ag5248, Run 260269133 


Tissue Name 


Rel. Exp.(%) Ag5242, Run 260269083 


Rel. Exp.(%) Ag5247, Run 260269132 


Rel. Exp.(%) Ag5248, Run 260269133 


Colon cancer 1 


0.0 


0.0 


3.5 


Bladder cancer 
NAT 2 


0.0 


0.0 


0.0 


Colon cancer 
NAT 1 


7.2 


0.0 


11.0 


Bladder cancer 
NAT 3 


0.0 


0.0 


0.0 


Colon cancer 2 


0.0 


0.0 


0.0 


Bladder cancer 
NAT 4 


0.0 


0.0 


0.0 


Colon cancer 
NAT 2 


17.6 


16.6 


15.7 


Prostate 

adenocarcinoma 
1 


2.4 


20.9 


5.8 


Colon cancer 3 


4.5 


0.0 


3.8 


Prostate 

adenocarcinoma 
2 


0.0 


0.0 


2.0 


Colon cancer 
NAT 3 


37.1 


0.0 


27.0 


Prostate 

adenocarcinoma 
3 


71.7 


55.9 


54.3 


Colon 
malignant 
cancer 4 


6.1 


0.0 


1.0 


Prostate 

adenocarcinoma 
4 


1.0 


0.0 


7.2 
Colon normal 
adjacent tissue 
4 


0.0 


0.0 


2.4 


Prostate cancer 
NAT 5 


4.5 


0.0 


0.0 


Lung cancer 1 


25.0 


17.9 


4.2 


Prostate 

adenocarcinoma 
6 


30.6 


4.5 


11.1 


Lung NAT 1 


2.3 


3.9 


12.9 


Prostate 

adenocarcinoma 
7 


14.4 


6.3 


23.0 


Lung cancer 2 


40.1 


100.0 


100.0 


Prostate 

adenocarcinoma 
8 


9.1 


5.0 


6.8 


Lung NAT 2 


32.3 


lo.z 


40.0 


Prostate 

adenocarcinoma 
9 




iU. / 




Squamous cell 

carcinoma 


73.2 


47.0 


82.4 


Prostate cancer 
NAT 10 


0.0 


0.0 


7.1 


Lung NAT 3 


13.3 


3.5 


5.8 


Kidney cancer 1 


0.0 


0.0 


0.0 


metastatic 
melanoma 1 


4.4 


0.0 


1.5 


Kidney NAT 1 


33.7 


11.7 


10.7 


Melanoma 2 


0.0 


0.0 


1.4 


Kidney cancer 2 


10.7 


"7 vf 
lA 


2.0 


Melanoma 3 


9.8 


0.0 


4.2 


Kidney NAT 2 


100.0 


42.9 


51.4 


metastatic 
melanoma 4 


2.1 


0.0 


1.0 


Kidney cancer 3 


61.1 


8.6 


24.8 


metastatic 
melanoma 5 


6.4 


9.3 


2.2 


Kidney NAT 3 


63.3 


16.0 


29.9 


Bladder cancer 
1 


0.0 


0.0 


0.0 


Kidney cancer 4 


8.8 


0.0 


1.9 


Bladder cancer 
NAT 1 


0.0 


0.0 


0.0 


Kidney NAT 4 


5.3 


0.0 


9.2 


Bladder cancer 
2 


2.1 


0.0 


0.0 









AI_comprehensive panel_v1.0 Summary: Ag5242 Highest expression is seen in an 
osteoarthritic bone sample (CT=27.5). Prominent levels of expression are seen in a cluster 
of samples derived from RA. Thus, expression of this gene could be used to differentiate 
between these samples and other samples on this panel and as a marker of rheumatoid 
arthritis. In addition, modulation of the expression or function of this gene may be useful in 
the treatment of RA. 

CNS_neurodegeneration_v1.0 Summary: Ag5242/Ag5243/Ag5247/Ag5248 
Multiple experiments with four different probe and primer sets produce results that are in 
reasonable agreement. These panels do not show differential expression of this gene in 
Alzheimer's disease. However, these profiles confirm the expression of this gene at 
moderate levels in the brain. Please see Panel 1.5 for discussion of this gene in the central 
nervous system. 

Ag5244 Three experiments with Ag5244, which is specific for CG150799-03, 
detect expression of this gene at low but significant levels in the hippocampus and temporal 
cortex of Alzheimer's patients. This expression may suggest an involvement of this gene 
product in the etiology of this disease. 

One experiment with Ag5244 (Run 276863567) and two experiments with Ag5245 
(Run 276863569 and Run 277731463), also specific for CG150799-03, show 
low/undetectable levels of expression (CTs>35). (Data not shown). Two additional 
experiments with Ag5245 show low expression in samples from the parietal cortex of a 
normal patient and the inferior temporal cortex of an Alzheimer's patient. 

General_screening_panel_v1.5 
Summary: Ag5242/Ag5243/Ag5245/Ag5247/Ag5248 Multiple experiments with five 
different probe and primer sets produce results that are in reasonable agreement. Highest 
expression is seen in cell lines from lung and prostate cancers and the fetal brain 
(CTs=28-30). This gene, which encodes a MASS1 homolog, appears to be preferentially 
expressed in the brain, with prominent levels of expression in all regions of the CNS 
examined. MASS1 is a large, calcium-binding GPCR expressed in the central nervous 
system that may play a fundamental role in its development (McMillan, J Biol Chem 2002 
Jan 4;277(1):785-92). In addition, this gene has been associated with some 
nonsymptomatic epilepsies (Skradski, Neuron, Vol 31, 537-544, August 2001). Thus, based 
on the homology of this protein to MASS1 and the preferential expression in the brain, 
expression of this gene could be used to differentiate between brain and non-neural tissue. 
In addition, therapeutic modulation of the expression or function of this gene may be useful 
in the treatment of neurological disorders, such as Alzheimer's disease, Parkinson's disease, 
schizophrenia, multiple sclerosis, stroke and epilepsy. 

Moderate levels of expression are also seen in samples from lung, colon, ovarian 
and prostate cancer cell lines. This suggests that expression of this gene could be used as a 
marker of these cancers. Furthermore, therapeutic modulation of the expression or function 
of this gene may be useful in the treatment of these cancers. 

Ag5244 Expression of this gene is low/undetectable (CTs > 35) across all of the 
samples on this panel. 
General_screening_panel_v1.6 Summary: [primer-probe set identifiers illegible in source] 
Multiple experiments with three different probe and primer sets produce results that are in 
very good agreement. Highest expression is seen in a lung cancer cell line and the fetal 
brain (CTs=27-32). Overall, expression is in excellent agreement with Panel 1.5, with 
prominent expression seen in all regions of the CNS, and lung and prostate cancer cell 
lines. Please see Panel 1.5 for further discussion of this gene. 

Ag5244 Expression of this gene is low/undetectable (CTs > 35) across all of the 
samples on this panel. 

Panel 4.1D Summary: Ag5242/Ag5243/Ag5247/Ag5248 Multiple experiments 
with four different probe and primer sets show highest expression of this gene in primary 
activated Th1 cells and resting neutrophils (CTs=27-31). Since this gene is expressed 
predominantly in activated Th-1 vs Th-2 cells, regulation of the expression of this gene 
might also be important for autoimmune diseases such as rheumatoid arthritis (please see 
also AI panel). Moderate levels of expression are also seen in IL-4 treated lung fibroblasts 
and resting neutrophils. Thus, therapeutic regulation of the transcript or the protein encoded 
by the transcript could be important in immune modulation and in the treatment of T 
cell-mediated diseases such as asthma, arthritis, psoriasis, IBD, and lupus. 

Ag5245 Highest expression of this gene is seen in IL-4 treated lung fibroblasts 
(CT=32). Low but significant expression is also seen in TNF-a/IL-1b treated lung 
fibroblasts and primary activated Th1 cells. Three experiments with the probe and primer 
set Ag5244 show low/undetectable levels of expression (CTs>35). 

general oncology screening panel_v_2.4 
Summary: Ag5242/Ag5243/Ag5247/Ag5248 Four experiments with the different probe 
and primer sets show highest expression in a lung cancer and normal kidney tissue 
adjacent to a tumor (CTs=31-34). Overall, this gene is expressed at low but significant 
levels in prostate cancer, normal kidney and kidney cancer, squamous cell carcinoma and 
normal colon. Therefore, therapeutic modulation of this gene or its protein product may be 
useful in the treatment of lung, prostate and kidney cancers. 

Ag5244/Ag5245 Expression of this gene is low/undetectable in all samples on this 
panel (CTs>35). 
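
The summaries above read expression off CT values and treat CTs > 35 as low/undetectable. The following is a minimal sketch of how a CT value could map to a Rel. Exp.(%) figure of the kind tabulated here. It rests on two assumptions that this section does not state: that each sample is scaled to the lowest CT on its panel, and that signal doubles with every cycle. The function names and the sample names are hypothetical.

```python
def relative_expression(ct_values):
    """Scale every sample to the lowest CT on the panel (assumed to be 100%).

    Assumption: perfect doubling per cycle, so one extra cycle halves the value.
    """
    ct_min = min(ct_values.values())
    return {name: 100.0 * 2.0 ** (ct_min - ct) for name, ct in ct_values.items()}


def is_low_or_undetectable(ct, cutoff=35.0):
    """Apply the CT > 35 cutoff used in the panel summaries."""
    return ct > cutoff


# Hypothetical CT values for three samples on one panel.
panel = {"sample_a": 27.0, "sample_b": 28.0, "sample_c": 36.0}
rel = relative_expression(panel)
flags = {name: is_low_or_undetectable(ct) for name, ct in panel.items()}
```

Under these assumptions a one-cycle difference in CT corresponds to a two-fold difference in relative expression, which is why samples with CTs above 35 show up as values at or near 0.0 in the tables.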

AG. CG151014-01: Metabotropic glutamate receptor 3-variant 
Expression of gene CG151014-01 was assessed using the primer-probe set Ag5219, 
described in Table AGA. Results of the RTQ-PCR runs are shown in Tables AGB, AGC 
and AGD. 

Table AGA. Probe Name Ag5219 

Primers  Sequences                                   Length  Start Position  SEQ ID No 
Forward  5'-tgattgtgaattgcagttcagt-3'                22      2550            381 
Probe    TET-5'-aagtgctcacgtgcagctccagaata-3'-TAMRA  26      2598            382 
Reverse  5'-gtactagggttgttcttttgctct-3'              24      2631            383 
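
As a quick consistency check on Table AGA, the sketch below compares each stated length with the length of the listed sequence and estimates the target region the three oligonucleotides span. The span calculation assumes that each start position is the lowest target coordinate covered by that oligonucleotide; the table does not state this convention, and the helper names are hypothetical.

```python
# Rows from Table AGA: (sequence, stated length, stated start position).
AG5219 = {
    "forward": ("tgattgtgaattgcagttcagt", 22, 2550),
    "probe": ("aagtgctcacgtgcagctccagaata", 26, 2598),
    "reverse": ("gtactagggttgttcttttgctct", 24, 2631),
}


def check_lengths(rows):
    """Return the names whose sequence length differs from the stated length."""
    return [name for name, (seq, length, _start) in rows.items() if len(seq) != length]


def spanned_region(rows):
    """Return (first, last, size) of the region covered by all oligonucleotides.

    Assumption: 'start' is the lowest target coordinate each one covers.
    """
    first = min(start for _seq, _length, start in rows.values())
    last = max(start + length - 1 for _seq, length, start in rows.values())
    return first, last, last - first + 1


mismatches = check_lengths(AG5219)
region = spanned_region(AG5219)
```

The same check can be run on the other primer-probe tables by substituting their rows.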



Table AGB. CNS neurodegeneration v1.0 



Tissue Name 


Rel. Exp.(%) Ag5219, Run 228020421 


Tissue Name 


Rel. Exp.(%) Ag5219, Run 228020421 


AD 1 Hippo 


9.4 


Control (Path) 3 Temporal Ctx 


6.5 


AD 2 Hippo 


24.8 


Control (Path) 4 Temporal Ctx 


25.0 


AD 3 Hippo 


6.3 


AD 1 Occipital Ctx 


15.7 


AD 4 Hippo 


7.6 


AD 2 Occipital Ctx (Missing) 


0.0 


AD 5 Hippo 


53.2 


AD 3 Occipital Ctx 


6.8 


AD 6 Hippo 


24.1 


AD 4 Occipital Ctx 


33.2 


Control 2 Hippo 


40.9 


AD 5 Occipital Ctx 


51.8 


Control 4 Hippo 


6.7 


AD 6 Occipital Ctx 


15.3 


Control (Path) 3 Hippo 


5.6 


Control 1 Occipital Ctx 


7.6 


AD 1 Temporal Ctx 


19.1 


Control 2 Occipital Ctx 


46.0 


AD 2 Temporal Ctx 


34.9 


Control 3 Occipital Ctx 


16.6 


AD 3 Temporal Ctx 


5.6 


Control 4 Occipital Ctx 


8.5 


AD 4 Temporal Ctx 


25.3 


Control (Path) 1 Occipital Ctx 


90.1 


AD 5 Inf Temporal Ctx 


100.0 


Control (Path) 2 Occipital Ctx 


11.5 


AD 5 Sup Temporal Ctx 


32.5 


Control (Path) 3 Occipital Ctx 


3.8 


AD 6 Inf Temporal Ctx 


44.1 


Control (Path) 4 Occipital Ctx 


11.9 


AD 6 Sup Temporal Ctx 


32.5 


Control 1 Parietal Ctx 


9.5 


Control 1 Temporal Ctx 


10.5 


Control 2 Parietal Ctx 


40.6 


Control 2 Temporal Ctx 


45.4 


Control 3 Parietal Ctx 


18.3 


Control 3 Temporal Ctx 


28.9 


Control (Path) 1 Parietal Ctx 


74.2 


Control 4 Temporal Ctx 


10.1 


Control (Path) 2 Parietal Ctx 


27.5 


Control (Path) 1 Temporal Ctx 


65.1 


Control (Path) 3 Parietal Ctx 


5.0 


Control (Path) 2 Temporal Ctx 


36.1 


Control (Path) 4 Parietal Ctx 


36.3 
Table AGC. General screening panel v1.5 



Tissue Name 


Rel. Exp.(%) Ag5219, Run 228758224 


Tissue Name 


Rel. Exp.(%) Ag5219, Run 228758224 


Adipose 


0.3 


Renal ca. TK-10 


0.4 


Melanoma* Hs688(A).T 


0.0 


Bladder 


0.2 


Melanoma* Hs688(B).T 


0.0 


Gastric ca. (liver met.) NCI-N87 


6.6 


Melanoma* M14 


0.0 


Gastric ca. KATO III 


0.0 


Melanoma* LOXIMVI 


0.5 


Colon ca. SW-948 


0.1 


Melanoma* SK-MEL-5 


0.8 


Colon ca. SW480 


0.6 


Squamous cell carcinoma SCC-4 


0.8 


Colon ca.* (SW480 met) SW620 


1.1 


Testis Pool 


0.4 


Colon ca. HT29 


0.0 


Prostate ca.* (bone met) PC-3 


2.1 


Colon ca. HCT-116 


1.7 


Prostate Pool 


0.5 


Colon ca. CaCo-2 


0.7 


Placenta 


0.0 


Colon cancer tissue 


0.0 


Uterus Pool 


0.2 


Colon ca. SW1116 


0.0 


Ovarian ca. OVCAR-3 


1.0 


Colon ca. Colo-205 


0.0 


Ovarian ca. SK-OV-3 


0.9 


Colon ca. SW-48 


0.0 


Ovarian ca. OVCAR-4 


0.0 


Colon Pool 


0.7 


Ovarian ca. OVCAR-5 


0.2 


Small Intestine Pool 


0.7 


Ovarian ca. IGROV-1 


0.0 


Stomach Pool 


1.4 


Ovarian ca. OVCAR-8 


0.1 


Bone Marrow Pool 


0.1 


Ovary 


0.1 


Fetal Heart 


0.6 


Breast ca. MCF-7 


0.0 


Heart Pool 


0.3 


Breast ca. MDA-MB-231 


0.5 


Lymph Node Pool 


1.1 


Breast ca. BT 549 


0.0 


Fetal Skeletal Muscle 


0.1 


Breast ca. T47D 


0.0 


Skeletal Muscle Pool 


0.7 


Breast ca. MDA-N 


0.0 


Spleen Pool 


1.4 


Breast Pool 


2.6 


Thymus Pool 


0.4 


Trachea 


0.4 


CNS cancer (glio/astro) U87-MG 


1.0 


Lung 


0.2 


CNS cancer (glio/astro) U-118-MG 


0.1 


Fetal Lung 


0.8 


CNS cancer (neuro;met) SK-N-AS 


1.4 


Lung ca. NCI-N417 


0.1 


CNS cancer (astro) SF-539 


0.0 


Lung ca. LX-1 


4.5 


CNS cancer (astro) SNB-75 


0.0 


Lung ca. NCI-H146 


1.1 


CNS cancer (glio) SNB-19 


0.0 


Lung ca. SHP-77 


3.3 


CNS cancer (glio) SF-295 


0.0 


Lung ca. A549 


0.0 


Brain (Amygdala) Pool 


60.3 


Lung ca. NCI-H526 


0.3 


Brain (cerebellum) 


100.0 


Lung ca. NCI-H23 


0.4 


Brain (fetal) 


66.4 
Lung ca. NCI-H460 


0.9 


Brain (Hippocampus) Pool 




Lung ca. HOP-62 


0.0 


Cerebral Cortex Pool 


80.1 


Lung ca. NCI-H522 


0.7 


Brain (Substantia nigra) Pool 


54.0 


Liver 


0.0 


Brain (Thalamus) Pool 


94.6 


Fetal Liver 


0.4 


Brain (whole) 


65.1 


Liver ca. HepG2 


0.9 


Spinal Cord Pool 


36.6 


Kidney Pool 


1.5 


Adrenal Gland 


0.6 


Fetal Kidney 


0.7 


Pituitary gland Pool 


0.9 


Renal ca. 786-0 


0.0 


Salivary Gland 


0.2 


Renal ca. A498 


0.0 


Thyroid (female) 


0.0 


Renal ca. ACHN 


1.0 


Pancreatic ca. CAPAN2 


0.1 


Renal ca. UO-31 


0.5 


Pancreas Pool 


0.9 



Table AGD. Panel 4.1D 



Tissue Name 


Rel. Exp.(%) Ag5219, Run 229739298 


Tissue Name 


Rel. Exp.(%) Ag5219, Run 229739298 


Secondary Th1 act 


0.0 


HUVEC IL-1beta 


3.3 


Secondary Th2 act 


3.2 


HUVEC IFN gamma 


14.0 


Secondary Tr1 act 


0.0 


HUVEC TNF alpha + IFN gamma 


2.9 


Secondary Th1 rest 


0.0 


HUVEC TNF alpha + IL4 


1.8 


Secondary Th2 rest 


0.0 


HUVEC IL-11 


21.8 


Secondary Tr1 rest 


2.9 


Lung Microvascular EC none 


100.0 


Primary Th1 act 


0.0 


Lung Microvascular EC TNFalpha 
+ IL-1beta 


31.9 


Primary Th2 act 


5.8 


Microvascular Dermal EC none 


0.0 


Primary Tr1 act 


0.0 


Microvascular Dermal EC 
TNFalpha + IL-1beta 


15.5 


Primary Th1 rest 


0.0 


Bronchial epithelium TNFalpha + 
IL-1beta 


0.0 


Primary Th2 rest 


1.8 


Small airway epithelium none 


0.0 


Primary Tr1 rest 


4.7 


Small airway epithelium TNFalpha 
+ IL-1beta 


3.4 


CD45RA CD4 lymphocyte act 


0.0 


Coronary artery SMC rest 


2.3 


CD45RO CD4 lymphocyte act 


11.1 


Coronary artery SMC TNFalpha + 
IL-1beta 


0.0 


CD8 lymphocyte act 


6.7 


Astrocytes rest 


0.0 


Secondary CD8 lymphocyte rest 


5.9 


Astrocytes TNFalpha + IL-1beta 


3.4 


Secondary CD8 lymphocyte act 


0.0 


KU-812 (Basophil) rest 


4.1 


CD4 lymphocyte none 


3.3 


KU-812 (Basophil) 
PMA/ionomycin 


26.1 
2ry Th1/Th2/Tr1_anti-CD95 
CH11 


5.9 


CCD1106 (Keratinocytes) none 


4.5 




J.VJ 


CCD1106 (Keratinocytes) 
TNFalpha + IL-1beta 


0.0 


LAK cells IL-2 


2.0 


Liver cirrhosis 


0.0 


LAK cells IL-2+IL-12 


0.0 


NCI-H292 none 


18.2 


LAK cells IL-2+IFN gamma 


3.0 


NCI-H292 IL-4 


16.7 


LAK cells IL-2+IL-18 


2.7 


NCI-H292 IL-9 


25.0 


LAK cells PMA/ionomycin 


0.0 


NCI-H292 IL-13 


48.3 


NK Cells IL-2 rest 


24.1 


NCI-H292 IFN gamma 


19.9 


Two Way MLR 3 day 


3.5 


HPAEC none 


8.1 


Two Way MLR 5 day 


1.5 


HPAEC TNF alpha + IL-1 beta 


7.8 


Two Way MLR 7 day 


0.0 


Lung fibroblast none 


0.0 


PBMC rest 


0.0 


Lung fibroblast TNF alpha + IL-1 
beta 


2.0 


PBMC PWM 


1.0 


Lung fibroblast IL-4 


7.9 


PBMC PHA-L 


0 ft 


Lung fibroblast IL-9 


0.0 


Ramos (B cell) none 


18.2 


Lung fibroblast IL-13 


0.0 


Ramos (B cell) ionomycin 


59.9 


Lung fibroblast IFN gamma 


2.8 


B lymphocytes PWM 


4.2 


Dermal fibroblast CCD1070 rest 


0.0 


B lymphocytes CD40L and IL-4 


13.2 


Dermal fibroblast CCD1070 TNF 
alpha 


0.0 


EOL-1 dbcAMP 


0.0 


Dermal fibroblast CCD1070 IL-1 
beta 


6.7 


EOL-1 dbcAMP 
PMA/ionomycin 


4.8 


Dermal fibroblast IFN gamma 


40.6 


Dendritic cells none 


4.4 


Dermal fibroblast IL-4 


25.0 


Dendritic cells LPS 


0.0 


Dermal Fibroblasts rest 


2.1 


Dendritic cells anti-CD40 


0.0 


Neutrophils TNFa+LPS 


0.0 


Monocytes rest 


0.0 


Neutrophils rest 


0.0 


Monocytes LPS 


0.0 


Colon 


0.0 


Macrophages rest 


0.0 


Lung 


0.0 


Macrophages LPS 


0.0 


Thymus 


3.0 


HUVEC none 


1.7 


Kidney 


11.3 


HUVEC starved 


28.1 







CNS_neurodegeneration_v1.0 Summary: Ag5219 This panel confirms the 
expression of this gene at low levels in the brain in an independent group of individuals. 
This gene is found to be slightly down-regulated in the temporal cortex of Alzheimer's 
disease patients. Therefore, up-regulation of this gene or its protein product, or treatment 
with specific agonists for this receptor may be of use in reversing the dementia, memory 
loss, and neuronal death associated with this disease. 
General_screening_panel_v1.5 Summary: Ag5219 Highest expression of this 
gene is detected in cerebellum (CT=27). High expression of this gene is mainly seen in all the 
regions of the central nervous system examined, including amygdala, hippocampus, substantia 
nigra, thalamus, cerebellum, cerebral cortex, and spinal cord. Therefore, therapeutic 
modulation of this gene product may be useful in the treatment of central nervous system 
disorders such as Alzheimer's disease, Parkinson's disease, epilepsy, multiple sclerosis, 
schizophrenia and depression. 

In addition, moderate to low levels of expression of this gene are also seen in a 
number of cancer cell lines derived from brain, colon, gastric, lung, ovarian, and prostate 
cancers, squamous cell carcinoma and melanoma. Therefore, therapeutic modulation of this 
gene may be useful in the treatment of these cancers. 

Low levels of expression of this gene are also seen in tissues with 
metabolic/endocrine functions including pancreas, adrenal and pituitary glands, fetal heart, 
skeletal muscle and gastrointestinal tract. Therefore, therapeutic modulation of the activity 
of this gene may prove useful in the treatment of endocrine/metabolically related diseases, 
such as obesity and diabetes. 

Panel 4.1D Summary: Ag5219 Highest expression of this gene is detected in lung 
microvascular endothelial cells (CT=32.4). This gene is expressed at lower levels in 
cytokine activated lung microvascular cells, activated dermal fibroblasts, resting and 
activated mucoepidermoid NCI-H292, activated basophils, starved and IL-11 stimulated 
HUVEC cells, Ramos B cells, and resting IL-2 treated NK cells. Therefore, therapeutic 
modulation of this gene may be useful in the treatment of autoimmune and inflammatory 
diseases such as asthma, allergies, inflammatory bowel disease, lupus erythematosus, 
psoriasis, rheumatoid arthritis, and osteoarthritis. 

AH. CG151014-02 and CG151014-03: Metabotropic glutamate 
receptor 3. 

Expression of genes CG151014-02 and CG151014-03 was assessed using the 
primer-probe set Ag5220, described in Table AHA. Results of the RTQ-PCR runs are 
shown in Tables AHB and AHC. Please note that CG151014-03 represents a full-length 
physical clone. 
Table AHA. Probe Name Ag5220 

Primers  Sequences                                   Length  Start Position  SEQ ID No 
Forward  5'-atcaacttcacgggtgcag-3'                   19      1399            384 
Probe    TET-5'-ctttgtggtcttgggctgtttgtttg-3'-TAMRA  26      1453            385 
Reverse  5'-caggatgatgtgaaccttgg-3'                  20      1482            386 



Table AHB. CNS neurodegeneration v1.0 



Tissue Name 


Rel. Exp.(%) Ag5220, Run 228020422 


Tissue Name 


Rel. Exp.(%) Ag5220, Run 228020422 


AD 1 Hippo 


2.0 


Control (Path) 3 Temporal Ctx 


5.8 


AD 2 Hippo 


49.0 


Control (Path) 4 Temporal Ctx 


25.2 


AD 3 Hippo 


1.0 


AD 1 Occipital Ctx 


5.6 


AD 4 Hippo 


13.5 


AD 2 Occipital Ctx (Missing) 


0.0 


AD 5 Hippo 


35.4 


AD 3 Occipital Ctx 


3.1 


AD 6 Hippo 


59.9 


AD 4 Occipital Ctx 


24.7 


Control 2 Hippo 


34.2 


AD 5 Occipital Ctx 


17.2 


Control 4 Hippo 


7.0 


AD 6 Occipital Ctx 


61.6 


Control (Path) 3 Hippo 


4.4 


Control 1 Occipital Ctx 


2.6 


AD 1 Temporal Ctx 


6.0 


Control 2 Occipital Ctx 


43.2 


AD 2 Temporal Ctx 


39.2 


Control 3 Occipital Ctx 


10.2 


AD 3 Temporal Ctx 


2.4 


Control 4 Occipital Ctx 


9.0 


AD 4 Temporal Ctx 


29.9 


Control (Path) 1 Occipital Ctx 


100.0 


AD 5 Inf Temporal Ctx 


76.3 


Control (Path) 2 Occipital Ctx 


7.7 


AD 5 Sup Temporal Ctx 


29.9 


Control (Path) 3 Occipital Ctx 


2.1 


AD 6 Inf Temporal Ctx 


60.3 


Control (Path) 4 Occipital Ctx 


14.2 


AD 6 Sup Temporal Ctx 


69.3 


Control 1 Parietal Ctx 


7.0 


Control 1 Temporal Ctx 


13.2 


Control 2 Parietal Ctx 


24.3 


Control 2 Temporal Ctx 


52.9 


Control 3 Parietal Ctx 


15.4 


Control 3 Temporal Ctx 


23.3 


Control (Path) 1 Parietal Ctx 


89.5 


Control 4 Temporal Ctx 


11.7 


Control (Path) 2 Parietal Ctx 


15.2 


Control (Path) 1 Temporal Ctx 


87.1 


Control (Path) 3 Parietal Ctx 


6.4 


Control (Path) 2 Temporal Ctx 


59.0 


Control (Path) 4 Parietal Ctx 


33.0 
Table AHC. General screening panel v1.5 



Tissue Name 


Rel. Exp.(%) Ag5220, Run 228758228 


Tissue Name 


Rel. Exp.(%) Ag5220, Run 228758228 


Adipose 


0.0 


Renal ca. TK-10 


0.0 


Melanoma* Hs688(A).T 


0.0 


Bladder 


0.0 


Melanoma* Hs688(B).T 


0.0 


Gastric ca. (liver met.) NCI-N87 


0.0 


Melanoma* M14 


0.0 


Gastric ca. KATO III 


0.0 


Melanoma* LOXIMVI 


0.0 


Colon ca. SW-948 


0.0 


Melanoma* SK-MEL-5 


0.0 


Colon ca. SW480 


0.0 


Squamous cell carcinoma SCC-4 


0.0 


Colon ca.* (SW480 met) SW620 


0.0 


Testis Pool 


0.0 


Colon ca. HT29 


0.0 


Prostate ca.* (bone met) PC-3 


0.0 


Colon ca. HCT-116 


0.0 


Prostate Pool 


0.0 


Colon ca. CaCo-2 


0.0 


Placenta 


0.0 


Colon cancer tissue 


0.0 


Uterus Pool 


0.0 


Colon ca. SW1116 


0.0 


Ovarian ca. OVCAR-3 


0.0 


Colon ca. Colo-205 


0.0 


Ovarian ca. SK-OV-3 


0.0 


Colon ca. SW-48 


0.0 


Ovarian ca. OVCAR-4 


0.0 


Colon Pool 


0.0 


Ovarian ca. OVCAR-5 


0.0 


Small Intestine Pool 


0.0 


Ovarian ca. IGROV-1 


0.0 


Stomach Pool 


1.6 


Ovarian ca. OVCAR-8 


0.0 


Bone Marrow Pool 


0.0 


Ovary 


0.0 


Fetal Heart 


0.0 


Breast ca. MCF-7 


0.0 


Heart Pool 


do 


Breast ca. MDA-MB-231 


0.0 


Lymph Node Pool 


0.7 


Breast ca. BT 549 


0.0 


Fetal Skeletal Muscle 


0.0 


Breast ca. T47D 


0.0 


Skeletal Muscle Pool 


0.0 


Breast ca. MDA-N 


0.0 


Spleen Pool 


0.0 


Breast Pool 


2.3 


Thymus Pool 


0.0 


Trachea 


0.0 


CNS cancer (glio/astro) U87-MG 


0.0 


Lung 


0.0 


CNS cancer (glio/astro) U-118-MG 


0.0 


Fetal Lung 


0.0 


CNS cancer (neuro;met) SK-N-AS 


0.0 


Lung ca. NCI-N417 


0.0 


CNS cancer (astro) SF-539 


0.0 


Lung ca. LX-1 


0.0 


CNS cancer (astro) SNB-75 


0.0 


Lung ca. NCI-H146 


0.0 


CNS cancer (glio) SNB-19 


0.0 


Lung ca. SHP-77 


0.0 


CNS cancer (glio) SF-295 


0.0 


Lung ca. A549 


0.0 


Brain (Amygdala) Pool 


75.8 


Lung ca. NCI-H526 


0.0 


Brain (cerebellum) 


100.0 


Lung ca. NCI-H23 


0.0 


Brain (fetal) 


69.3 


Lung ca. NCI-H460 


0.2 


Brain (Hippocampus) Pool 


53.2 


Lung ca. HOP-62 


0.0 


Cerebral Cortex Pool 


72.2 
Lung ca. NCI-H522 


0.0 


Brain (Substantia nigra) Pool 


80.7 


Liver 


0.0 


Brain (Thalamus) Pool 


96.6 


Fetal Liver 




Brain (whole) 


78.5 


Liver ca. HepG2 


0.0 


Spinal Cord Pool 


25.0 


Kidney Pool 


0.0 


Adrenal Gland 


4.3 


Fetal Kidney 


0.5 


Pituitary gland Pool 


0.0 


Renal ca. 786-0 


0.0 


Salivary Gland 


0.0 


Renal ca. A498 


0.0 


Thyroid (female) 


0.0 


Renal ca. ACHN 


0.0 


Pancreatic ca. CAPAN2 


0.0 


Renal ca. UO-31 


0.0 


Pancreas Pool 


0.0 



CNS_neurodegeneration_v1.0 Summary: Ag5220 This panel confirms the 
expression of this gene at low levels in the brains of an independent group of individuals. 
However, no differential expression of this gene was detected between Alzheimer's 
diseased postmortem brains and those of non-demented controls in this experiment. Please 
see Panel 1.5 for a discussion of this gene in treatment of central nervous system disorders. 

General_screening_panel_v1.5 Summary: Ag5220 Highest expression of this 
gene is detected in cerebellum (CT=27). High expression of this gene is mainly seen in all the 
regions of the central nervous system examined, including amygdala, hippocampus, substantia 
nigra, thalamus, cerebellum, cerebral cortex, and spinal cord. Therefore, therapeutic 
modulation of this gene product may be useful in the treatment of central nervous system 
disorders such as Alzheimer's disease, Parkinson's disease, epilepsy, multiple sclerosis, 
schizophrenia and depression. 

Panel 4.1D Summary: Ag5220 Expression of this gene is low/undetectable (CTs 
> 35) across all of the samples on this panel. 

AI. CG151297-01: CALMODULIN-DEPENDENT 
PHOSPHODIESTERASE. 

Expression of gene CG151297-01 was assessed using the primer-probe set Ag7165, 
described in Table AIA. Results of the RTQ-PCR runs are shown in Table AIB. Please note 
that CG151297-01 represents a full-length physical clone. 
Table AIA. Probe Name Ag7165 

Primers  Sequences                                       Length  Start Position  SEQ ID No 
Forward  5'-agaatgtaccgaaaaacattttctct-3' 
Probe    TET-5'-ttcctcttatagaggaagcctcaaaagccg-3'-TAMRA  30      536             388 
Reverse  5'-tgcttgccacataggaagaa-3'                      20      570             389 



Table AIB. Panel 4.1D 



Tissue Name 


Rel. Exp.(%) Ag7165, Run 307719896 


Tissue Name 


Rel. Exp.(%) Ag7165, Run 307719896 


Secondary Th1 act 




HUVEC IL-1beta 


0.0 


Secondary Th2 act 


0.0 


HUVEC IFN gamma 


0.0 


Secondary Tr1 act 


0.0 


HUVEC TNF alpha + IFN gamma 


0.0 


Secondary Th1 rest 


0.0 


HUVEC TNF alpha + IL4 


0.0 


Secondary Th2 rest 


0.0 


HUVEC IL-11 


0.0 


Secondary Tr1 rest 


0.0 


Lung Microvascular EC none 


0.0 


Primary Th1 act 


0.0 


Lung Microvascular EC TNFalpha 
+ IL-1beta 


0.0 


Primary Th2 act 


0.0 


Microvascular Dermal EC none 


0.0 


Primary Tr1 act 


0.0 


Microvascular Dermal EC 
TNFalpha + IL-1beta 


0.0 


Primary Th1 rest 


0.0 


Bronchial epithelium TNFalpha + 
IL-1beta 


0.0 


Primary Th2 rest 


0.0 


Small airway epithelium none 


0.0 


Primary Tr1 rest 


0.0 


Small airway epithelium TNFalpha 
+ IL-1beta 


0.0 


CD45RA CD4 lymphocyte act 


0.0 


Coronary artery SMC rest 


0.0 


CD45RO CD4 lymphocyte act 


0.0 


Coronary artery SMC TNFalpha + 
IL-1beta 


0.0 


CD8 lymphocyte act 


0.0 


Astrocytes rest 


0.0 


Secondary CD8 lymphocyte rest 


0.0 


Astrocytes TNFalpha + IL-1beta 


0.0 


Secondary CD8 lymphocyte act 


0.0 


KU-812 (Basophil) rest 


0.0 


CD4 lymphocyte none 


0.0 


KU-812 (Basophil) 
PMA/ionomycin 


0.0 


2ry Th1/Th2/Tr1_anti-CD95 
CH11 


0.0 


CCD1106 (Keratinocytes) none 


0.0 


LAK cells rest 


0.0 


CCD1106 (Keratinocytes) 
TNFalpha + IL-1beta 


0.0 


LAK cells IL-2 


0.0 


Liver cirrhosis 


100.0 


LAK cells IL-2+IL-12 


0.0 


NCI-H292 none 


0.0 


LAK cells IL-2+IFN gamma 


0.0 


NCI-H292 IL-4 


0.0 


LAK cells IL-2+ IL-18 


0.0 


NCI-H292 IL-9 


0.0 


LAK cells PMA/ionomycin 


0.0 


NCI-H292 IL-13 


0.0 
NK Cells IL-2 rest 


0.0 


NCI-H292 IFN gamma 




Two Way MLR 3 day 


0.0 


HPAEC none 


0.0 


Two Way MLR 5 day 


0.0 


HPAEC TNF alpha + IL-1 beta 


0.0 


Two Way MLR 7 day 


0.0 


Lung fibroblast none 


0.0 


PBMC rest 


0.0 


Lung fibroblast TNF alpha + IL-1 
beta 


0.0 


PBMC PWM 


0.0 


Lung fibroblast IL-4 


0.0 


PBMC PHA-L 


0.0 


Lung fibroblast IL-9 




Ramos (B cell) none 


0.0 


Lung fibroblast IL-13 


0.0 


Ramos (B cell) ionomycin 


0.0 


Lung fibroblast IFN gamma 


0.0 


B lymphocytes PWM 


0.0 


Dermal fibroblast CCD1070 rest 


0.0 


B lymphocytes CD40L and IL-4 


0.0 


Dermal fibroblast CCD1070 TNF 
alpha 


0.0 


EOL-1 dbcAMP 


0.0 


Dermal fibroblast CCD1070 IL-1 
beta 


0.0 


EOL-1 dbcAMP 
PMA/ionomycin 


0.0 


Dermal fibroblast IFN gamma 


0.0 


Dendritic cells none 


0.0 


Dermal fibroblast IL-4 


0.0 


Dendritic cells LPS 


0.0 


Dermal Fibroblasts rest 


0.0 


Dendritic cells anti-CD40 


0.0 


Neutrophils TNFa+LPS 


0.0 


Monocytes rest 


0.0 


Neutrophils rest 


0.0 


Monocytes LPS 


0.0 


Colon 


0.0 


Macrophages rest 


0.0 


Lung 


0.0 


Macrophages LPS 


0.0 


Thymus 


0.0 


HUVEC none 


0.0 


Kidney 


0.0 


HUVEC starved 


0.0 







CNS_neurodegeneration_v1.0 Summary: Ag7165 Expression of this gene is 
low/undetectable (CTs > 35) across all of the samples on this panel. 

Panel 4.1D Summary: Ag7165 A moderate level of expression of this gene is 
detected mainly in the liver cirrhosis sample (CT=31.5). The presence of this gene in liver 
cirrhosis (a component of which involves liver inflammation and fibrosis) suggests that 
antibodies to the protein encoded by this gene could also be used for the diagnosis of liver 
cirrhosis. Furthermore, therapeutic agents involving this gene may be useful in reducing or 
inhibiting the inflammation associated with fibrotic and inflammatory diseases. 

AJ. CG152256-01: Phosphatidylserine synthase. 
Expression of gene CG152256-01 was assessed using the primer-probe set Ag6718, 
described in Table AJA. Results of the RTQ-PCR runs are shown in Tables AJB, AJC and 
AJD. 

Table AJA. Probe Name Ag6718

Primers | Sequence | Length | Start Position | SEQ ID No
Forward | 5'-gagcctcgcttccgattat-3' | 19 | 2012 | 390
Probe | TET-5'-tcccttcccaatattattcatccaga-3'-TAMRA | 26 | 2031 | 391
Reverse | 5'-ctctagcaggtttgcttttgtg-3' | 22 | 2070 | 392
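
The Length column of a primer table can be cross-checked against the listed sequences. The following is a minimal sketch, assuming the sequences and lengths as transcribed from Table AJA; the names `PRIMERS` and `check_lengths` are illustrative and are not part of the original disclosure.

```python
# Illustrative consistency check for Table AJA (probe set Ag6718).
# PRIMERS and check_lengths are hypothetical names introduced for this sketch.
PRIMERS = {
    "Forward": ("gagcctcgcttccgattat", 19),
    "Probe": ("tcccttcccaatattattcatccaga", 26),
    "Reverse": ("ctctagcaggtttgcttttgtg", 22),
}


def check_lengths(primers):
    """Return the names of entries whose sequence length differs from the stated length."""
    mismatches = []
    for name, (sequence, stated_length) in primers.items():
        if len(sequence) != stated_length:
            mismatches.append(name)
    return mismatches


if __name__ == "__main__":
    # An empty list means every stated length matches its sequence.
    print(check_lengths(PRIMERS))
```

The same check could be applied to the other primer tables by substituting their sequences and stated lengths.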



Table AJB. CNS_neurodegeneration_v1.0

Tissue Name | Rel. Exp.(%) Ag6718, Run 276596848 | Tissue Name | Rel. Exp.(%) Ag6718, Run 276596848
AD 1 Hippo | 19.8 | Control (Path) 3 Temporal Ctx | 2.6
AD 2 Hippo | 26.6 | Control (Path) 4 Temporal Ctx | 15.3
AD 3 Hippo | 4.3 | AD 1 Occipital Ctx | 9.9
AD 4 Hippo | 3.7 | AD 2 Occipital Ctx (Missing) | 0.0
AD 5 Hippo | 58.6 | AD 3 Occipital Ctx | 7.1
AD 6 Hippo | 45.4 | AD 4 Occipital Ctx | 15.9
Control 2 Hippo | 28.5 | AD 5 Occipital Ctx | 26.6
Control 4 Hippo | 8.4 | AD 6 Occipital Ctx | 15.1
Control (Path) 3 Hippo | 3.1 | Control 1 Occipital Ctx | 3.6
AD 1 Temporal Ctx | 4.8 | Control 2 Occipital Ctx | 67.4
AD 2 Temporal Ctx | 24.7 | Control 3 Occipital Ctx | 31.2
AD 3 Temporal Ctx | 7.5 | Control 4 Occipital Ctx | 1.8
AD 4 Temporal Ctx | 10.5 | Control (Path) 1 Occipital Ctx | 100.0
AD 5 Inf Temporal Ctx | 62.9 | Control (Path) 2 Occipital Ctx | 9.5
AD 5 Sup Temporal Ctx | 46.3 | Control (Path) 3 Occipital Ctx | 5.3
AD 6 Inf Temporal Ctx | 43.5 | Control (Path) 4 Occipital Ctx | 10.0
AD 6 Sup Temporal Ctx | 43.2 | Control 1 Parietal Ctx | 3.8
Control 1 Temporal Ctx | 4.1 | Control 2 Parietal Ctx | 27.9
Control 2 Temporal Ctx | 59.0 | Control 3 Parietal Ctx | 15.0
Control 3 Temporal Ctx | 17.6 | Control (Path) 1 Parietal Ctx | 89.5
Control 4 Temporal Ctx | 5.0 | Control (Path) 2 Parietal Ctx | 10.2
Control (Path) 1 Temporal Ctx | 57.0 | Control (Path) 3 Parietal Ctx | 7.0
Control (Path) 2 Temporal Ctx | 30.4 | Control (Path) 4 Parietal Ctx | 27.9

Table AJC. General_screening_panel_v1.6



Tissue Name | Rel. Exp.(%) Ag6718, Run 277223813 | Tissue Name | Rel. Exp.(%) Ag6718, Run 277223813
Adipose | 2.3 | Renal ca. TK-10 | 34.4
Melanoma* Hs688(A).T | 16.4 | Bladder | 22.2
Melanoma* Hs688(B).T | 20.0 | Gastric ca. (liver met.) NCI-N87 | 54.0
Melanoma* M14 | 30.6 | Gastric ca. KATO III | 48.3
Melanoma* LOXIMVI | 55.1 | Colon ca. SW-948 | 31.0
Melanoma* SK-MEL-5 | 81.8 | Colon ca. SW480 | 87.1
Squamous cell carcinoma SCC-4 | 23.5 | Colon ca.* (SW480 met) SW620 | 69.7
Testis Pool | 5.2 | Colon ca. HT29 | 0.0
Prostate ca.* (bone met) PC-3 | 100.0 | Colon ca. HCT-116 | 51.4
Prostate Pool | 1.8 | Colon ca. CaCo-2 | 15.9
Placenta | 2.6 | Colon cancer tissue | 23.5
Uterus Pool | 0.8 | Colon ca. SW1116 | 25.0
Ovarian ca. OVCAR-3 | 27.4 | Colon ca. Colo-205 | 21.9
Ovarian ca. SK-OV-3 | 29.9 | Colon ca. SW-48 | 24.1
Ovarian ca. OVCAR-4 | 33.0 | Colon Pool | 12.4
Ovarian ca. OVCAR-5 | 59.9 | Small Intestine Pool | 4.8
Ovarian ca. IGROV-1 | 47.6 | Stomach Pool | 1.8
Ovarian ca. OVCAR-8 | 32.8 | Bone Marrow Pool | 0.0
Ovary | 11.7 | Fetal Heart | 14.2
Breast ca. MCF-7 | 18.9 | Heart Pool | 11.6
Breast ca. MDA-MB-231 | 48.0 | Lymph Node Pool | 3.8
Breast ca. BT 549 | 31.6 | Fetal Skeletal Muscle | 3.3
Breast ca. T47D | 3.6 | Skeletal Muscle Pool | 0.0
Breast ca. MDA-N | 17.9 | Spleen Pool | 2.0
Breast Pool | 7.0 | Thymus Pool | 11.7
Trachea | 9.2 | CNS cancer (glio/astro) U87-MG | 32.3
Lung | 2.4 | CNS cancer (glio/astro) U-118-MG | 43.2
Fetal Lung | 4.9 | CNS cancer (neuro;met) SK-N-AS | 25.9
Lung ca. NCI-N417 | 15.0 | CNS cancer (astro) SF-539 | 29.5
Lung ca. LX-1 | 17.6 | CNS cancer (astro) SNB-75 | 59.0
Lung ca. NCI-H146 | 23.7 | CNS cancer (glio) SNB-19 | 29.7
Lung ca. SHP-77 | 53.2 | CNS cancer (glio) SF-295 | 59.5
Lung ca. A549 | 28.3 | Brain (Amygdala) Pool | 10.4
Lung ca. NCI-H526 | 24.3 | Brain (cerebellum) | 34.4
Lung ca. NCI-H23 | 71.7 | Brain (fetal) | 17.3
Lung ca. NCI-H460 | 14.2 | Brain (Hippocampus) Pool | [illegible]
Lung ca. HOP-62 | 32.3 | Cerebral Cortex Pool | 7.4
Lung ca. NCI-H522 | 16.4 | Brain (Substantia nigra) Pool | 3.9
Liver | 1.0 | Brain (Thalamus) Pool | 6.9
Fetal Liver | 2.3 | Brain (whole) | 6.5
Liver ca. HepG2 | 19.2 | Spinal Cord Pool | 5.6
Kidney Pool | 15.2 | Adrenal Gland | 10.3
Fetal Kidney | 4.1 | Pituitary gland Pool | 1.1
Renal ca. 786-0 | 61.6 | Salivary Gland | 3.2
Renal ca. A498 | 5.6 | Thyroid (female) | 11.5
Renal ca. ACHN | 24.7 | Pancreatic ca. CAPAN2 | 28.1
Renal ca. UO-31 | 33.9 | Pancreas Pool | 8.3



Table AJD. Panel 4.1D

Tissue Name | Rel. Exp.(%) Ag6718, Run 276596888 | Tissue Name | Rel. Exp.(%) Ag6718, Run 276596888
Secondary Th1 act | 51.4 | HUVEC IL-1beta | 18.0
Secondary Th2 act | 39.5 | HUVEC IFN gamma | 16.5
Secondary Tr1 act | 19.3 | HUVEC TNF alpha + IFN gamma | 4.5
Secondary Th1 rest | 5.3 | HUVEC TNF alpha + IL4 | 3.1
Secondary Th2 rest | 4.5 | HUVEC IL-11 | 0.0
Secondary Tr1 rest | 5.9 | Lung Microvascular EC none | 13.9
Primary Th1 act | 3.5 | Lung Microvascular EC TNFalpha + IL-1beta | 0.7
Primary Th2 act | 20.7 | Microvascular Dermal EC none | 3.0
Primary Tr1 act | 12.8 | Microvascular Dermal EC TNFalpha + IL-1beta | 1.2
Primary Th1 rest | 1.6 | Bronchial epithelium TNFalpha + IL1beta | 5.8
Primary Th2 rest | 5.8 | Small airway epithelium none | 6.3
Primary Tr1 rest | 0.7 | Small airway epithelium TNFalpha + IL-1beta | 9.7
CD45RA CD4 lymphocyte act | 26.4 | Coronary artery SMC rest | 7.1
CD45RO CD4 lymphocyte act | 30.8 | Coronary artery SMC TNFalpha + IL-1beta | 8.4
CD8 lymphocyte act | 7.6 | Astrocytes rest | 3.3
Secondary CD8 lymphocyte rest | 6.3 | Astrocytes TNFalpha + IL-1beta | 2.9
Secondary CD8 lymphocyte act | 1.5 | KU-812 (Basophil) rest | 44.8
CD4 lymphocyte none | 3.6 | KU-812 (Basophil) PMA/ionomycin | 28.1
2ry Th1/Th2/Tr1_anti-CD95 CH11 | 2.9 | CCD1106 (Keratinocytes) none | 27.5
LAK cells rest | 4.5 | CCD1106 (Keratinocytes) TNFalpha + IL-1beta | 5.1
LAK cells IL-2 | 9.9 | Liver cirrhosis | 0.8
LAK cells IL-2+IL-12 | 0.7 | NCI-H292 none | 8.0
LAK cells IL-2+IFN gamma | 4.2 | NCI-H292 IL-4 | 10.2
LAK cells IL-2+IL-18 | 1.4 | NCI-H292 IL-9 | 19.2
LAK cells PMA/ionomycin | 18.7 | NCI-H292 IL-13 | 14.8
NK Cells IL-2 rest | 21.0 | NCI-H292 IFN gamma | 6.8
Two Way MLR 3 day | 7.6 | HPAEC none | 3.7
Two Way MLR 5 day | 5.2 | HPAEC TNF alpha + IL-1 beta | 8.5
Two Way MLR 7 day | 4.3 | Lung fibroblast none | 6.8
PBMC rest | 1.4 | Lung fibroblast TNF alpha + IL-1 beta | 1.9
PBMC PWM | 3.0 | Lung fibroblast IL-4 | 6.1
PBMC PHA-L | 4.1 | Lung fibroblast IL-9 | [illegible]
Ramos (B cell) none | 42.9 | Lung fibroblast IL-13 | 7.7
Ramos (B cell) ionomycin | 22.1 | Lung fibroblast IFN gamma | 16.4
B lymphocytes PWM | 10.8 | Dermal fibroblast CCD1070 rest | 33.9
B lymphocytes CD40L and IL-4 | 12.2 | Dermal fibroblast CCD1070 TNF alpha | 100.0
EOL-1 dbcAMP | 39.0 | Dermal fibroblast CCD1070 IL-1 beta | 17.4
EOL-1 dbcAMP PMA/ionomycin | 14.1 | Dermal fibroblast IFN gamma | 6.7
Dendritic cells none | 13.5 | Dermal fibroblast IL-4 | 10.4
Dendritic cells LPS | 2.5 | Dermal Fibroblasts rest | 6.9
Dendritic cells anti-CD40 | 4.5 | Neutrophils TNFa+LPS | 0.4
Monocytes rest | 0.6 | Neutrophils rest | 0.7
Monocytes LPS | 3.9 | Colon | 0.8
Macrophages rest | 1.4 | Lung | 0.6
Macrophages LPS | 3.8 | Thymus | 2.9
HUVEC none | 11.1 | Kidney | 8.1
HUVEC starved | 6.4 | |



CNS_neurodegeneration_v1.0 Summary: Ag6718 This panel confirms the
expression of this gene at low levels in the brains of an independent group of individuals.
However, no differential expression of this gene was detected between Alzheimer's
diseased postmortem brains and those of non-demented controls in this experiment. Please
see Panel 1.6 for a discussion of this gene in treatment of central nervous system disorders.

General_screening_panel_v1.6 Summary: Ag6718 Highest expression of this
gene is detected in the prostate cancer PC-3 cell line (CT=31.9). Moderate levels of expression
of this gene are also seen in a cluster of cancer cell lines derived from pancreatic, gastric,
colon, lung, liver, renal, breast, ovarian, prostate, squamous cell carcinoma, melanoma and
brain cancers. Thus, expression of this gene could be used as a marker to detect the
presence of these cancers. Furthermore, therapeutic modulation of the expression or
function of this gene may be effective in the treatment of pancreatic, gastric, colon, lung,
liver, renal, breast, ovarian, prostate, squamous cell carcinoma, melanoma and brain
cancers.

In addition, this gene is expressed at low levels in cerebellum and fetal brain.
Therefore, therapeutic modulation of this gene product may be useful in the treatment of
central nervous system disorders such as ataxia and autism.
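
The summaries cite CT values, while the tables report Rel. Exp.(%) with the highest-expressing sample on each panel set to 100. A minimal sketch of one common way to relate the two is given below. It assumes ideal amplification, that is, a twofold change in template per PCR cycle, and it ignores any normalization to reference genes, so it is not necessarily the calculation used for these tables. The function name `relative_expression` is hypothetical, and of the example CT values only CT=31.9 for PC-3 comes from the text.

```python
def relative_expression(ct_values):
    """Scale CT values to percent of the highest-expressing sample.

    Assumes a twofold difference in template per cycle (ideal efficiency),
    so a sample k cycles later than the lowest CT is reported as 100 / 2**k.
    """
    lowest_ct = min(ct_values.values())
    return {
        sample: 100.0 * 2.0 ** (lowest_ct - ct)
        for sample, ct in ct_values.items()
    }


if __name__ == "__main__":
    # Only the PC-3 value (CT=31.9) comes from the text; the others are invented.
    example = {"PC-3": 31.9, "sample A": 32.9, "sample B": 34.9}
    for sample, percent in relative_expression(example).items():
        print(f"{sample}: {percent:.1f}")
```

Under this assumption, each additional cycle halves the reported percentage, which is why samples with CTs above 35 appear at or near 0.0 in the tables.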

Panel 4.1D Summary: Ag6718 Highest expression of this gene is detected in TNF
alpha treated dermal fibroblasts (CT=32). Moderate to low levels of expression of this gene
are detected in activated polarized, naive and memory T cells, PMA/ionomycin treated LAK
cells, resting IL-2 treated NK cells, Ramos B cells, eosinophils, activated HUVEC cells,
lung microvascular endothelial cells, basophils and activated mucoepidermoid NCI-H292
cells. Therefore, therapeutic modulation of this gene or its protein product may alter
functions associated with these cell types and improve the
symptoms of patients suffering from autoimmune and inflammatory diseases such as
asthma, allergies, inflammatory bowel disease, lupus erythematosus, psoriasis, rheumatoid
arthritis, and osteoarthritis.

AK. CG173017-01: RETINOIC ACID RECEPTOR RXR-BETA.

Expression of gene CG173017-01 was assessed using the primer-probe set Ag7565,
described in Table AKA.

Table AKA. Probe Name Ag7565

Primers | Sequence | Length | Start Position | SEQ ID No
Forward | 5'-ctggacgggacgggat-3' | 16 | 222 | 393
Probe | TET-5'-acatagccgtttgccagccccag-3'-TAMRA | 23 | 261 | 394

CNS_neurodegeneration_v1.0 Summary: Ag7565 Expression of this gene is
low/undetectable in all samples on this panel (CTs>35).

Panel 4.1D Summary: Ag7565 Expression of this gene is low/undetectable in all
samples on this panel (CTs>35).

AL. CG173347-01: Novel Serum paraoxonase/arylesterase 3. 

Expression of gene CG173347-01 was assessed using the primer-probe set Ag7564, 
described in Table ALA. 

Table ALA. Probe Name Ag7564

Primers | Sequence | Length | Start Position | SEQ ID No
Forward | 5'-gaaagtggctctgaagatattgatatact-3' | 29 | 153 | 396
Probe | TET-5'-tcctagtgggctggcttttatctcc-3'-TAMRA | 25 | 182 | 397
Reverse | 5'-actccaacagacctgcagact-3' | 21 | 207 | 398



CNS_neurodegeneration_v1.0 Summary: Ag7564 Expression of this gene is
low/undetectable in all samples on this panel (CTs>35).

Panel 4.1D Summary: Ag7564 Expression of this gene is low/undetectable in all 
samples on this panel (CTs>35). 

AM. CG56234-02: Splice variant of PCK2. 

Expression of gene CG56234-02 was assessed using the primer-probe set Ag5111,
described in Table AMA. Results of the RTQ-PCR runs are shown in Tables AMB, AMC,
AMD and AME.

Table AMA. Probe Name Ag5111

Primers | Sequence | Length | Start Position | SEQ ID No
Forward | 5'-ctgggaggccccaga-3' | 15 | 1377 | 399
Probe | TET-5'-tgtccccattgacgccatcatc-3'-TAMRA | 22 | 1395 | 400
Reverse | 5'-gatgatcttccctttgggtct-3' | 21 | 1429 | 401



Table AMB. General_screening_panel_v1.5

Tissue Name | Rel. Exp.(%) Ag5111, Run 228980587 | Tissue Name | Rel. Exp.(%) Ag5111, Run 228980587
Adipose | 2.0 | Renal ca. TK-10 | 29.1
Melanoma* Hs688(A).T | 31.9 | Bladder | 12.1
Melanoma* Hs688(B).T | 28.3 | Gastric ca. (liver met.) NCI-N87 | 31.4
Melanoma* M14 | 9.9 | Gastric ca. KATO III | 28.1
Melanoma* LOXIMVI | 4.5 | Colon ca. SW-948 | 17.9
Melanoma* SK-MEL-5 | 39.8 | Colon ca. SW480 | 14.9
Squamous cell carcinoma SCC-4 | 4.7 | Colon ca.* (SW480 met) SW620 | 29.5
Testis Pool | 1.6 | Colon ca. HT29 | 8.6
Prostate ca.* (bone met) PC-3 | 55.1 | Colon ca. HCT-116 | 11.0
Prostate Pool | 0.5 | Colon ca. CaCo-2 | 44.4
Placenta | 0.3 | Colon cancer tissue | 9.7
Uterus Pool | 0.6 | Colon ca. SW1116 | 1.4
Ovarian ca. OVCAR-3 | [illegible] | Colon ca. Colo-205 | 6.6
Ovarian ca. SK-OV-3 | [illegible] | Colon ca. SW-48 | 14.4
Ovarian ca. OVCAR-4 | [illegible] | Colon Pool | 0.1
Ovarian ca. OVCAR-5 | [illegible] | Small Intestine Pool | 0.6
Ovarian ca. IGROV-1 | [illegible] | Stomach Pool | 1.1
Ovarian ca. OVCAR-8 | 100.0 | Bone Marrow Pool | [illegible]
Ovary | 0.0 | Fetal Heart | 0.0
Breast ca. MCF-7 | 87.7 | Heart Pool | 0.0
Breast ca. MDA-MB-231 | 12.6 | Lymph Node Pool | 0.8
Breast ca. BT 549 | 75.8 | Fetal Skeletal Muscle | 0.6
Breast ca. T47D | 10.1 | Skeletal Muscle Pool | 0.4
Breast ca. MDA-N | 16.4 | Spleen Pool | 1.7
Breast Pool | 0.5 | Thymus Pool | 0.4
Trachea | 4.3 | CNS cancer (glio/astro) U87-MG | 18.8
Lung | 0.0 | CNS cancer (glio/astro) U-118-MG | 9.3
Fetal Lung | 2.0 | CNS cancer (neuro;met) SK-N-AS | 7.5
Lung ca. NCI-N417 | 1.8 | CNS cancer (astro) SF-539 | 11.3
Lung ca. LX-1 | 8.2 | CNS cancer (astro) SNB-75 | [illegible]
Lung ca. NCI-H146 | 11.1 | CNS cancer (glio) SNB-19 | 31.0
Lung ca. SHP-77 | 11.3 | CNS cancer (glio) SF-295 | 32.5
Lung ca. A549 | 11.4 | Brain (Amygdala) Pool | [illegible]
Lung ca. NCI-H526 | 1.8 | Brain (cerebellum) | 0.3
Lung ca. NCI-H23 | 83.5 | Brain (fetal) | 0.3
Lung ca. NCI-H460 | 27.0 | Brain (Hippocampus) Pool | 2.5
Lung ca. HOP-62 | 1.0 | Cerebral Cortex Pool | 0.4
Lung ca. NCI-H522 | 67.4 | Brain (Substantia nigra) Pool | 0.6
Liver | 6.3 | Brain (Thalamus) Pool | 1.0
Fetal Liver | 6.7 | Brain (whole) | 0.7
Liver ca. HepG2 | 24.7 | Spinal Cord Pool | 1.1
Kidney Pool | 0.8 | Adrenal Gland | 1.6
Fetal Kidney | 1.0 | Pituitary gland Pool | 0.4
Renal ca. 786-0 | 8.7 | Salivary Gland | 0.9
Renal ca. A498 | 1.5 | Thyroid (female) | 0.7
Renal ca. ACHN | 9.3 | Pancreatic ca. CAPAN2 | 12.8
Renal ca. UO-31 | 1.9 | Pancreas Pool | 0.8



Table AMC. General_screening_panel_v1.6

Tissue Name | Rel. Exp.(%) Ag5111, Run 277218717 | Rel. Exp.(%) Ag5111, Run 277731246 | Rel. Exp.(%) Ag5111, Run 278368614 | Tissue Name | Rel. Exp.(%) Ag5111, Run 277218717 | Rel. Exp.(%) Ag5111, Run 277731246 | Rel. Exp.(%) Ag5111, Run 278368614
Adipose | 0.5 | 0.0 | 1.5 | Renal ca. TK-10 | 24.7 | 20.2 | 33.0
Melanoma* Hs688(A).T | 26.1 | 29.5 | 31.6 | Bladder | 6.7 | 6.1 | 11.6
Melanoma* Hs688(B).T | 25.2 | 32.1 | 31.9 | Gastric ca. (liver met.) NCI-N87 | 21.3 | 22.5 | 36.1
Melanoma* M14 | 5.6 | 9.7 | 7.5 | Gastric ca. KATO III | 14.6 | 12.2 | 19.2
Melanoma* LOXIMVI | 3.0 | 0.0 | 4.2 | Colon ca. SW-948 | 18.8 | 16.5 | 23.5
Melanoma* SK-MEL-5 | 28.7 | 57.0 | 39.8 | Colon ca. SW480 | 11.8 | 7.3 | 19.5
Squamous cell carcinoma SCC-4 | 4.8 | 4.2 | 5.1 | Colon ca.* (SW480 met) SW620 | 23.0 | 19.9 | 35.6
Testis Pool | 2.0 | 0.0 | 1.4 | Colon ca. HT29 | 10.2 | 4.2 | 8.2
Prostate ca.* (bone met) PC-3 | 33.2 | 44.4 | 57.8 | Colon ca. HCT-116 | 9.6 | 7.6 | 19.9
Prostate Pool | 0.3 | 0.0 | 0.6 | Colon ca. CaCo-2 | 9.4 | 25.0 | 36.9
Placenta | 0.3 | 0.0 | 1.1 | Colon cancer tissue | 6.0 | 0.0 | 6.6
Uterus Pool | 0.0 | 0.0 | 0.6 | Colon ca. SW1116 | 2.3 | 0.0 | 1.7
Ovarian ca. OVCAR-3 | 12.7 | 8.2 | 18.2 | Colon ca. Colo-205 | 5.1 | 4.7 | 5.9
Ovarian ca. SK-OV-3 | 5.3 | 6.5 | 12.2 | Colon ca. SW-48 | 9.0 | 0.0 | 11.6
Ovarian ca. OVCAR-4 | 4.0 | 5.2 | 5.8 | Colon Pool | 0.7 | 0.0 | 0.7
Ovarian ca. OVCAR-5 | 31.6 | 24.8 | 34.2 | Small Intestine Pool | 0.3 | 0.0 | 0.8
Ovarian ca. IGROV-1 | 19.2 | 12.8 | 27.2 | Stomach Pool | 1.2 | 0.0 | 2.3
Ovarian ca. OVCAR-8 | 100.0 | 100.0 | 100.0 | Bone Marrow Pool | 0.0 | 0.0 | 0.0
Ovary | 0.0 | 0.0 | 0.2 | Fetal Heart | 0.0 | 0.0 | 0.3
Breast ca. MCF-7 | 54.0 | 51.4 | 77.9 | Heart Pool | 0.4 | 0.0 | 0.0
Breast ca. MDA-MB-231 | 8.5 | 7.6 | 7.7 | Lymph Node Pool | 1.2 | 0.0 | 0.0
Breast ca. BT 549 | 47.0 | 30.4 | 49.0 | Fetal Skeletal Muscle | 0.0 | 0.0 | 0.0
Breast ca. T47D | 5.1 | 6.5 | 7.1 | Skeletal Muscle Pool | 0.0 | 0.0 | 0.0
Breast ca. MDA-N | 6.1 | 6.0 | 24.5 | Spleen Pool | [illegible] | 0.0 | [illegible]
Breast Pool | 0.3 | 0.0 | 0.3 | Thymus Pool | 0.5 | 0.0 | 1.8
Trachea | 3.3 | 0.0 | 8.3 | CNS cancer (glio/astro) U87-MG | 12.9 | 7.9 | 13.8
Lung | 0.0 | 0.0 | 0.0 | CNS cancer (glio/astro) U-118-MG | 5.9 | 4.4 | 8.1
Fetal Lung | 0.9 | 0.0 | 2.1 | CNS cancer (neuro;met) SK-N-AS | 6.4 | 4.9 | 6.7
Lung ca. NCI-N417 | 1.3 | 0.0 | 3.8 | CNS cancer (astro) SF-539 | 5.8 | 6.4 | 8.5
Lung ca. LX-1 | 5.5 | 7.8 | 9.5 | CNS cancer (astro) SNB-75 | 25.0 | 29.9 | 26.8
Lung ca. NCI-H146 | 8.0 | 8.5 | 11.5 | CNS cancer (glio) SNB-19 | 23.8 | 20.7 | 29.5
Lung ca. SHP-77 | 12.2 | 14.3 | 21.3 | CNS cancer (glio) SF-295 | 38.2 | 28.7 | 46.7
Lung ca. A549 | 11.5 | 11.7 | 15.9 | Brain (Amygdala) Pool | 0.8 | 0.0 | 1.1
Lung ca. NCI-H526 | 1.8 | 0.0 | 1.7 | Brain (cerebellum) | 1.0 | 0.0 | 1.1
Lung ca. NCI-H23 | 42.6 | 68.8 | 55.1 | Brain (fetal) | 0.0 | 0.0 | 0.4
Lung ca. NCI-H460 | 16.7 | 23.5 | 38.4 | Brain (Hippocampus) Pool | 0.4 | 0.0 | 1.2
Lung ca. HOP-62 | 2.0 | 0.0 | 3.0 | Cerebral Cortex Pool | 0.0 | 0.0 | 0.6
Lung ca. NCI-H522 | 41.5 | 64.2 | 87.1 | Brain (Substantia nigra) Pool | 0.0 | 0.0 | 0.4
Liver | 4.4 | 4.0 | [illegible] | Brain (Thalamus) Pool | [illegible] | [illegible] | 0.0
Fetal Liver | 5.8 | 3.3 | 8.7 | Brain (whole) | 6.7 | 0.0 | 2.8
Liver ca. HepG2 | 15.7 | 16.3 | 18.8 | Spinal Cord Pool | 0.6 | 0.0 | 0.5
Kidney Pool | 0.7 | 0.0 | 0.3 | Adrenal Gland | 1.4 | [illegible] | 1.4
Fetal Kidney | 0.9 | 0.0 | 1.0 | Pituitary gland Pool | 0.0 | 0.0 | 0.7
Renal ca. 786-0 | 9.3 | 8.1 | 13.8 | Salivary Gland | 0.8 | 0.0 | 1.8
Renal ca. A498 | 1.1 | 0.0 | 2.0 | Thyroid (female) | 1.0 | 0.0 | 2.1
Renal ca. ACHN | 5.8 | 6.0 | 10.8 | Pancreatic ca. CAPAN2 | 13.1 | 9.6 | 19.9
Renal ca. UO-31 | 2.4 | 0.0 | 3.3 | Pancreas Pool | 4.8 | 0.0 | 7.3
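
Agreement among the three runs in Table AMC can be quantified with a pairwise correlation. The sketch below uses a handful of rows transcribed from the table. The choice of rows, the names `RUNS` and `pearson`, and the use of Pearson correlation as the agreement measure are assumptions made for illustration; the source does not state how agreement was assessed.

```python
from itertools import combinations
from math import sqrt

# Rel. Exp.(%) for a few samples from Table AMC, one list per run
# (order: Adipose, Renal ca. TK-10, Ovarian ca. OVCAR-8, Breast ca. MCF-7,
#  Lung ca. NCI-H23, Lung). The row selection is arbitrary.
RUNS = {
    "277218717": [0.5, 24.7, 100.0, 54.0, 42.6, 0.0],
    "277731246": [0.0, 20.2, 100.0, 51.4, 68.8, 0.0],
    "278368614": [1.5, 33.0, 100.0, 77.9, 55.1, 0.0],
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Compare every pair of runs.
    for (run_a, a), (run_b, b) in combinations(RUNS.items(), 2):
        print(f"{run_a} vs {run_b}: r = {pearson(a, b):.3f}")
```

A coefficient close to 1 for every pair would be consistent with runs that rank the samples similarly; a fuller comparison would use all rows of the table.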



Table AMD. Panel 4.1D

Tissue Name | Rel. Exp.(%) Ag5111, Run 226444761 | Rel. Exp.(%) Ag5111, Run 276596864 | Tissue Name | Rel. Exp.(%) Ag5111, Run 226444761 | Rel. Exp.(%) Ag5111, Run 276596864
Secondary Th1 act | 90.8 | 58.6 | HUVEC IL-1beta | 18.7 | 10.7
Secondary Th2 act | 40.9 | 57.8 | HUVEC IFN gamma | 2.8 | 6.2
Secondary Tr1 act | 57.4 | 16.5 | HUVEC TNF alpha + IFN gamma | 5.0 | 6.2
Secondary Th1 rest | 27.2 | 8.4 | HUVEC TNF alpha + IL4 | 23.2 | 8.8
Secondary Th2 rest | 6.0 | 0.0 | HUVEC IL-11 | [illegible] | 0.0
Secondary Tr1 rest | 7.2 | 4.0 | Lung Microvascular EC none | 3.2 | 15.4
Primary Th1 act | 32.8 | 5.0 | Lung Microvascular EC TNFalpha + IL-1beta | 6.4 | 0.0
Primary Th2 act | 49.0 | 19.9 | Microvascular Dermal EC none | 0.0 | 0.0
Primary Tr1 act | 50.0 | 38.4 | Microvascular Dermal EC TNFalpha + IL-1beta | 0.0 | 0.0
Primary Th1 rest | 6.0 | 8.5 | Bronchial epithelium TNFalpha + IL1beta | 8.7 | [illegible]
Primary Th2 rest | 6.4 | [illegible] | Small airway epithelium none | 2.2 | 0.0
Primary Tr1 rest | 18.0 | 0.0 | Small airway epithelium TNFalpha + IL-1beta | 11.8 | 0.0
CD45RA CD4 lymphocyte act | 95.9 | 76.8 | Coronary artery SMC rest | 18.3 | 10.2
CD45RO CD4 lymphocyte act | 95.3 | 100.0 | Coronary artery SMC TNFalpha + IL-1beta | 9.4 | 8.8
CD8 lymphocyte act | 77.4 | 4.5 | Astrocytes rest | 2.1 | 0.0
Secondary CD8 lymphocyte rest | [illegible] | 17.1 | Astrocytes TNFalpha + IL-1beta | 0.0 | 0.0
Secondary CD8 lymphocyte act | 21.0 | 7.7 | KU-812 (Basophil) rest | 25.9 | 10.2
CD4 lymphocyte none | 0.0 | [illegible] | KU-812 (Basophil) PMA/ionomycin | [illegible] | 21.2
2ry Th1/Th2/Tr1_anti-CD95 CH11 | 5.4 | 0.0 | CCD1106 (Keratinocytes) none | 15.2 | 4.9
LAK cells rest | 43.5 | 19.9 | CCD1106 (Keratinocytes) TNFalpha + IL-1beta | 9.0 | 12.3
LAK cells IL-2 | 52.1 | 18.4 | Liver cirrhosis | 8.3 | 0.0
LAK cells IL-2+IL-12 | 33.7 | 0.0 | NCI-H292 none | 15.3 | 3.4
LAK cells IL-2+IFN gamma | 57.0 | 6.6 | NCI-H292 IL-4 | 13.5 | 17.2
LAK cells IL-2+IL-18 | 46.0 | 9.5 | NCI-H292 IL-9 | 14.2 | 14.1
LAK cells PMA/ionomycin | 43.5 | 24.5 | NCI-H292 IL-13 | 29.1 | 11.3
NK Cells IL-2 rest | 60.7 | 37.4 | NCI-H292 IFN gamma | 44.8 | 7.2
Two Way MLR 3 day | 32.1 | 10.3 | HPAEC none | 2.0 | 0.0
Two Way MLR 5 day | 53.2 | 3.6 | HPAEC TNF alpha + IL-1 beta | 7.2 | 7.0
Two Way MLR 7 day | 23.5 | 9.6 | Lung fibroblast none | 21.2 | 15.9
PBMC rest | 6.1 | 0.0 | Lung fibroblast TNF alpha + IL-1 beta | 11.5 | 0.0
PBMC PWM | 23.5 | 9.1 | Lung fibroblast IL-4 | 2.4 | 0.0
PBMC PHA-L | [illegible] | 12.2 | Lung fibroblast IL-9 | [illegible] | [illegible]
Ramos (B cell) none | 58.6 | 16.7 | Lung fibroblast IL-13 | 13.4 | 0.0
Ramos (B cell) ionomycin | 71.7 | 92.7 | Lung fibroblast IFN gamma | 11.6 | 3.1
B lymphocytes PWM | 21.6 | 14.8 | Dermal fibroblast CCD1070 rest | 99.3 | 64.6
B lymphocytes CD40L and IL-4 | 29.7 | 23.2 | Dermal fibroblast CCD1070 TNF alpha | 74.7 | 88.9
EOL-1 dbcAMP | 32.3 | 32.8 | Dermal fibroblast CCD1070 IL-1 beta | 29.9 | 50.0
EOL-1 dbcAMP PMA/ionomycin | 10.6 | 3.2 | Dermal fibroblast IFN gamma | 13.3 | 0.0
Dendritic cells none | 66.0 | 24.5 | Dermal fibroblast IL-4 | 12.2 | 0.0
Dendritic cells LPS | 31.4 | 0.0 | Dermal Fibroblasts rest | 0.0 | 0.0
Dendritic cells anti-CD40 | [illegible] | [illegible] | Neutrophils TNFa+LPS | 0.0 | 0.0
Monocytes rest | 29.1 | 0.0 | Neutrophils rest | 0.0 | 0.0
Monocytes LPS | 37.6 | 18.0 | Colon | 32.3 | 8.2
Macrophages rest | 100.0 | 12.9 | Lung | 3.5 | 0.0
Macrophages LPS | 28.1 | 16.2 | Thymus | 12.1 | 0.0
HUVEC none | 7.9 | 5.7 | Kidney | 83.5 | 31.9
HUVEC starved | 17.4 | 8.4 | | |









Table AME. general oncology screening panel_v_2.4

Tissue Name | Rel. Exp.(%) Ag5111, Run 260280403 | Tissue Name | Rel. Exp.(%) Ag5111, Run 260280403
Colon cancer 1 | 49.0 | Bladder cancer NAT 2 | 0.0
Colon cancer NAT 1 | 2.5 | Bladder cancer NAT 3 | 0.0
Colon cancer 2 | 11.7 | Bladder cancer NAT 4 | 0.0
Colon cancer NAT 2 | 28.5 | Prostate adenocarcinoma 1 | 5.0
Colon cancer 3 | 43.5 | Prostate adenocarcinoma 2 | 0.0
Colon cancer NAT 3 | 53.2 | Prostate adenocarcinoma 3 | 0.0
Colon malignant cancer 4 | 100.0 | Prostate adenocarcinoma 4 | 0.0
Colon normal adjacent tissue 4 | 8.4 | Prostate cancer NAT 5 | 0.0
Lung cancer 1 | 12.2 | Prostate adenocarcinoma 6 | 0.0
Lung NAT 1 | 0.0 | Prostate adenocarcinoma 7 | 0.0
Lung cancer 2 | 72.2 | Prostate adenocarcinoma 8 | [illegible]
Lung NAT 2 | 0.0 | Prostate adenocarcinoma 9 | 4.0
Squamous cell carcinoma 3 | 18.8 | Prostate cancer NAT 10 | 6.0
Lung NAT 3 | 0.0 | Kidney cancer 1 | 7.5
metastatic melanoma 1 | 0.0 | Kidney NAT 1 | 0.0
Melanoma 2 | 6.3 | Kidney cancer 2 | 73.2
Melanoma 3 | 0.0 | Kidney NAT 2 | 9.2
metastatic melanoma 4 | 0.0 | Kidney cancer 3 | 6.3
metastatic melanoma 5 | 2.0 | Kidney NAT 3 | 0.0
Bladder cancer 1 | 0.0 | Kidney cancer 4 | 7.6
Bladder cancer NAT 1 | 0.0 | Kidney NAT 4 | 84.1
Bladder cancer 2 | 0.0 | |
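
Tumor samples in Table AME can be compared with normal adjacent tissue (NAT) samples pairwise. The sketch below is illustrative only: it assumes that a tumor sample and a NAT sample sharing the same number come from the same donor, which the table does not state, and the names `PAIRS` and `higher_in_tumor` are hypothetical. Only a subset of rows is transcribed.

```python
# (tumor, NAT) Rel. Exp.(%) values transcribed from Table AME.
# Pairing by shared sample number is an assumption made for this sketch.
PAIRS = {
    "Colon 1": (49.0, 2.5),
    "Colon 2": (11.7, 28.5),
    "Colon 3": (43.5, 53.2),
    "Colon 4": (100.0, 8.4),
    "Lung 1": (12.2, 0.0),
    "Lung 2": (72.2, 0.0),
    "Kidney 2": (73.2, 9.2),
    "Kidney 4": (7.6, 84.1),
}


def higher_in_tumor(pairs):
    """Return, in sorted order, the pair names where the tumor value exceeds the NAT value."""
    return sorted(name for name, (tumor, nat) in pairs.items() if tumor > nat)


if __name__ == "__main__":
    print(higher_in_tumor(PAIRS))
```

On these rows the tumor value exceeds the NAT value in some pairs and not in others, so any tumor-associated trend should be read sample by sample.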







CNS_neurodegeneration_v1.0 Summary: Ag5111 Expression of the
CG56234-02 gene is low/undetectable in all samples on this panel (CTs>35).

General_screening_panel_v1.5 Summary: Ag5111 Highest expression of the
CG56234-02 gene is seen in an ovarian cancer cell line (CT=30). This gene encodes a
splice variant of PEPCK2, the rate-limiting enzyme for gluconeogenesis that has been
shown to be regulated in response to hormones and environmental stress. In addition to the
ovarian cancer cell line, this gene is expressed at a moderate level in most of the cancer cell
lines used in this panel. Therefore, modulation of the gene product using small molecule
drugs may affect the growth and survival of cancer cells. Expression of this gene could
potentially be used as a diagnostic marker of the metabolic status of cells, and inhibition of
the activity of this gene product might be used for therapeutic treatment of cancers.

This gene is also moderately expressed (CT values = 34) in adult and fetal liver.
Inhibition of this enzyme could potentially decrease hepatic glucose production and thus
serve as an effective treatment for Type 2 diabetes, which is characterized by excess
hepatic glucose production.

General_screening_panel_v1.6 Summary: Ag5111 Three experiments with the
same probe and primer set produce results that are in excellent agreement. Highest expression
is seen in an ovarian cancer cell line (CTs=31-34) and, overall, expression of this gene
appears to be more highly associated with cancer cell line samples than with normal tissue
samples. These results are also in agreement with results in Panel 1.5. Please see that panel
for discussion of this gene.

Panel 4.1D Summary: Ag5111 This gene is expressed at low levels in a wide
range of cells across this panel (CTs=33.3-34.4), including CD4 T cells (naive and memory
T cells), CD8 T cells, B cells and macrophages. Expression of this transcript is also found
in dermal fibroblasts and kidney. This transcript encodes a homolog of a key enzyme in
gluconeogenesis and therefore may be important for the metabolic status of all these cell types,
which contribute to the inflammatory response. Therefore, modulation of the activity or
expression of this putative protein by small molecules could affect the activity of these cells
and be useful for the treatment of autoimmune diseases such as inflammatory bowel
diseases, rheumatoid arthritis, asthma, COPD, psoriasis and lupus.

general oncology screening panel_v_2.4 Summary: Ag5111 Low but significant
expression is seen in a colon cancer, a kidney cancer, and a lung cancer (CTs=34-35). This
is in agreement with the preferential expression in cancer cell lines seen in Panels 1.5 and
1.6. Please see Panel 1.5 for discussion of this gene in oncology.

AN. CG56836-03: Cathepsin B. 

Expression of gene CG56836-03 was assessed using the primer-probe sets Ag2052,
Ag5277 and Ag5278, described in Tables ANA, ANB and ANC. Results of the RTQ-PCR runs are
shown in Tables AND, ANE, ANF, ANG, ANH, ANI, ANJ and ANK.

Table ANA. Probe Name Ag2052

Primers | Sequence | Length | Start Position | SEQ ID No
Forward | 5'-gtcccaccatcaaagagatca-3' | 21 | 414 | 402
Probe | TET-5'-agaccagggctcctgtggctcct-3'-TAMRA | 23 | 436 | 403
Reverse | 5'-atgcagatccggtcagagat-3' | 20 | 485 | 404



Table ANB. Probe Name Ag5277

Primers | Sequence | Length | Start Position | SEQ ID No
Forward | 5'-gatctgcatccacaccaat-3' | 19 | 390 | 405
Probe | TET-5'-cctgctcacctgcctgctctacaagt-3'-TAMRA | 26 | 441 | 406
Reverse | 5'-cagtcagtgttccaggagtt-3' | 20 | 568 | 407

Table ANC. Probe Name Ag5278

Primers | Sequence | Length | Start Position | SEQ ID No
Forward | 5'-tatgaatccaatagcgaga-3' | 19 | 653 | 408
Probe | TET-5'-agctttctctgtgtattcggacttcc-3'-TAMRA | 26 | 715 | 409
Reverse | 5'-tgttggtacactcctgactt-3' | 20 | 749 | 410



5 

Table AND. AI comprehensive panel vLO 



Tissue Name 


Rel. 

A 22052 
Run 

275804031 


Scene Nflmp 


Rel. 

Exp.(%) 
Run 

275804031 


110967 COPD-F 


10.2 


1 12427 Match Control Psoriasis-F 


15.4 


110980 COPD-F 


6.4 


112418 Psoriasis-M 


10.4 


110968 COPD-M 


12.0 


1 12723 Match Control Psoriasis-M 


5.9 


110977 COPD-M 


14.0 


112419 Psoriasis-M 


12.9 


1 10989 Emphysema-F 


15.6 


1 12424 Match Control Psbriasis-M 


4.3 


110992 Emphysema-F 


20.0 


112420 Psoriasis-M 


29.7 


110993 Emphysema-F 


13.8 


112425 Match Control Psoriasis-M 


14.8 


1 10994 Emphysema-F 


6.0 


104689 (MF) OA Bone-Backus 


29.9 


1 10995 Emphysema-F 


33.2 


104690 (MF) Adj "Normal" 
Bone-Backus 


15.4 


110996 Emphysema-F 


8.5 


104691 (MF) OA Synovium-Backus 


55.9 


110997 Asthma-M 


6.1 


104692 (BA) OA Cartilage-Backus 


27.9 


111001 Asthma-F 


6.7 


104694 (BA) OA Bone-Backus 


39.5 


111002 Asthma-F 


11.2 


104695 (BA) Adj "Normal" 
Bone-Backus 


23.0 


1 1 1003 Atopic Asthma-F 


9.7 


104696 (BA) OA Synovium-Backus 


100.0 


111004 Atopic Asthma-F 


12.2 


104700 (SS) OA Bone-Backus 


12.2 


111005 Atopic Asthma-F 


7.4 


104701 (SS) Adj "Normal'* 
Bone-Backus 


24.3 


1 1 1006 Atopic Asthma-F 


1.7 


104702 (SS) OA Synovium-Backus 


43.8 


111417 Allergy-M 


9.0 


1 17093 OA Cartilage Rep7 


18.4 


1 12347 Allergy-M 


0.0 


112672 OA Bone5 


17.3 


112349 Normal Lung-F 


0.0 


1 12673 OA Synoviumi 


6.6 


112357 Normal Lung-F 


10.7 


1 12674 OA Synovial Fluid cells5 


8.4 


112354 Normal Lung-M 


3.6 


1 17100 OA Cartilage Repl4 


8.4 


112374 Crohns-F 


10.6 


112756 OA Bone9 


13.4 
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112389 Match Control Crohns-F 


14.1 


1 12757 OA Synofibr&* U J 0 K - ' 


^11372 




9.9 


1 197SR OA Svnovial Fluid CelkQ 




1 197^9 Match Pontrnl CYnhns-F 


6.6 


117 19S RA Cartilage Rer>2 


10 s 


112725 Crohns-M 


1.3 


113492 Bone2 RA 


11.7 


112387 Match Control 


11.7 


113493 Synovium2RA 


3.6 


112378 Crohns-M 


0.0 


113494 Syn Fluid Cells RA 


6.7 


112390 Match Control 

WlUJino'-lVA 


14.5 


113499 Cartilage4RA 


6.7 


112726 Crohns-M 


11.5 


1 13500 Bone4RA 


6.3 


112731 Match Control 

V_4UIlI15-lVi 


7.5 


113501 Synovium4RA 


5.1 


112380 Ulcer Col-F 


8.7 


113502 Syn Fluid Cells4RA 


3.4 


112734 Match Control Ulcer 
Col-F


15.4 


113495 Cartilage3RA 


7.2 


112384 Ulcer Col-F 


25.7 


113496 Bone3RA


7.0 


112737 Match Control Ulcer 
Col-F


4.1 


113497 Synovium3 RA 


4.4 


112386 Ulcer Col-F 


7.1 


113498 Syn Fluid Cells3RA 


9.7 


112738 Match Control Ulcer 
Col-F


13.1 


117106 Normal Cartilage Rep20


8.1 


112381 Ulcer Col-M 


0.1 


113663 Bone3 Normal


0.0 


112735 Match Control Ulcer
Col-M 


0.4 


113664 Synovium3 Normal


0.0 


112382 Ulcer Col-M 


12.9 


113665 Syn Fluid Cells3 Normal


0.0 


112394 Match Control Ulcer 
Col-M 


3.3 


117107 Normal Cartilage Rep22


3.2 


112383 Ulcer Col-M 


30.4 


113667 Bone4 Normal 


6.3 


112736 Match Control Ulcer 
Col-M 


11.0 


113668 Synovium4 Normal


8.1 


112423 Psoriasis-F 


5.5 


113669 Syn Fluid Cells4 Normal


12.9 



Table ANE. General_screening_panel_v1.5

Tissue Name | Rel. Exp.(%) Ag5278, Run 230509757 | Tissue Name | Rel. Exp.(%) Ag5278, Run 230509757
Adipose | 0.2 | Renal ca. TK-10 | 6.2
Melanoma* Hs688(A).T | 24.0 | Bladder | 5.1
Melanoma* Hs688(B).T | 12.9 | Gastric ca. (liver met.) NCI-N87 | 9.7
Melanoma* M14 | 51.8 | Gastric ca. KATO III | 5.7
Melanoma* LOXIMVI | 26.6 | Colon ca. SW-948 | 2.1
Melanoma* SK-MEL-5 | 17.0 | Colon ca. SW480 | 7.0
Squamous cell carcinoma SCC-4 | 3.2 | Colon ca.* (SW480 met) SW620 | [illegible]
Testis Pool | 0.5 | Colon ca. HT29 | 0.7
Prostate ca.* (bone met) PC-3 | 0.6 | Colon ca. HCT-116 | 2.6
Prostate Pool | 0.2 | Colon ca. CaCo-2 | 5.3
Placenta | 5.4 | Colon cancer tissue | 14.5
Uterus Pool | 0.0 | Colon ca. SW1116 | 2.3
Ovarian ca. OVCAR-3 | 16.3 | Colon ca. Colo-205 | 7.9
Ovarian ca. SK-OV-3 | 18.7 | Colon ca. SW-48 | 2.7
Ovarian ca. OVCAR-4 | 3.9 | Colon Pool | 1.8
Ovarian ca. OVCAR-5 | 5.7 | Small Intestine Pool | 0.7
Ovarian ca. IGROV-1 | 0.3 | Stomach Pool | 1.2
Ovarian ca. OVCAR-8 | 1.3 | Bone Marrow Pool |
Ovary | 3.2 | Fetal Heart | 0.5
Breast ca. MCF-7 | 3.0 | Heart Pool | 1.2
Breast ca. MDA-MB-231 | 4.1 | Lymph Node Pool | 2.9
Breast ca. BT 549 | 100.0 | Fetal Skeletal Muscle | 0.3
Breast ca. T47D | 2.0 | Skeletal Muscle Pool | 1.0
Breast ca. MDA-N | 1.6 | Spleen Pool | 2.1
Breast Pool | 2.0 | Thymus Pool | 1.4
Trachea | 2.3 | CNS cancer (glio/astro) U87-MG | 8.1
Lung | 0.5 | CNS cancer (glio/astro) U-118-MG | 12.3
Fetal Lung | 2.2 | CNS cancer (neuro;met) SK-N-AS | 2.0
Lung ca. NCI-N417 | 0.1 | CNS cancer (astro) SF-539 | 3.4
Lung ca. LX-1 | 6.1 | CNS cancer (astro) SNB-75 | 27.4
Lung ca. NCI-H146 | 0.4 | CNS cancer (glio) SNB-19 | 2.4
Lung ca. SHP-77 | 1.8 | CNS cancer (glio) SF-295 | 26.8
Lung ca. A549 | 4.1 | Brain (Amygdala) Pool | 2.1
Lung ca. NCI-H526 | 0.1 | Brain (cerebellum) | 6.9
Lung ca. NCI-H23 | 3.0 | Brain (fetal) | 1.2
Lung ca. NCI-H460 | 2.6 | Brain (Hippocampus) Pool | 1.9
Lung ca. HOP-62 | 4.0 | Cerebral Cortex Pool | 3.8
Lung ca. NCI-H522 | 1.0 | Brain (Substantia nigra) Pool | 2.6
Liver | 1.4 | Brain (Thalamus) Pool | 2.8
Fetal Liver | 10.4 | Brain (whole) | 5.3
Liver ca. HepG2 | 8.3 | Spinal Cord Pool | 2.4
Kidney Pool | 0.0 | Adrenal Gland | 3.2
Fetal Kidney | 0.7 | Pituitary gland Pool | 0.6
Renal ca. 786-0 | 5.3 | Salivary Gland | 2.5
Renal ca. A498 | 4.0 | Thyroid (female) | 25.3
Renal ca. ACHN | 3.0 | Pancreatic ca. CAPAN2 | 5.7
Renal ca. UO-31 | 15.2 | Pancreas Pool | 3.0

Table ANF. HASS Panel v1.0

Tissue Name | Rel. Exp.(%) Ag2052, Run 247736616 | Rel. Exp.(%) Ag2052, Run 248455625 | Tissue Name | Rel. Exp.(%) Ag2052, Run 247736616 | Rel. Exp.(%) Ag2052, Run 248455625
MCF-7 C1 | 12.6 | 7.1 | U87-MG F1 (B) | 40.3 | 22.4
MCF-7 C2 | 12.7 | 8.6 | U87-MG F2 | 11.1 | 6.7
MCF-7 C3 | 10.2 | 5.6 | U87-MG F3 | 12.2 | 8.0
MCF-7 C4 | 16.2 | 19.5 | U87-MG F4 | 27.0 | 17.8
MCF-7 C5 | 13.2 | 11.0 | U87-MG F5 | 59.0 | 38.2
MCF-7 C6 | 13.2 | 14.6 | U87-MG F6 | 61.1 | 44.4
MCF-7 C7 | 12.7 | 10.4 | U87-MG F7 | 72.7 | 50.7
MCF-7 C9 | 9.7 | 12.9 | U87-MG F8 | 75.3 | 54.7
MCF-7 C10 | 15.8 | 17.1 | U87-MG F9 | 29.9 | 28.1
MCF-7 C11 | 2.5 | 1.8 | U87-MG F10 | 65.1 | 50.0
MCF-7 C12 | 9.9 | 8.0 | U87-MG F11 | 58.2 | 48.3
MCF-7 C13 | 12.5 | 17.1 | U87-MG F12 | 47.0 | 42.6
MCF-7 C15 | 5.6 | 6.5 | U87-MG F13 | 95.3 | 77.9
MCF-7 C16 | 14.0 | 21.5 | U87-MG F14 | 96.6 | 80.1
MCF-7 C17 | 10.2 | 6.9 | U87-MG F15 | 64.6 | 54.7
T24 D1 | 25.0 | 14.4 | U87-MG F16 | 51.8 | 47.6
T24 D2 | 33.0 | 42.0 | U87-MG F17 | 62.0 | 49.0
T24 D3 | 29.3 | 19.1 | LnCAP A1 | 9.4 | 6.0
T24 D4 | 39.8 | 30.6 | LnCAP A2 | 8.1 | 5.5
T24 D5 | 28.5 | 19.5 | LnCAP A3 | 6.3 | 3.4
T24 D6 | 32.8 | 27.2 | LnCAP A4 | 10.4 | 6.9
T24 D7 | 18.3 | 25.9 | LnCAP A5 | 10.0 | 6.0
T24 D9 | 12.1 | 8.5 | LnCAP A6 | 10.0 | 6.3
T24 D10 | 23.5 | 19.2 | LnCAP A7 | 9.2 | 6.6
T24 D11 | 13.2 | 11.7 | LnCAP A8 | 11.5 | 8.8
T24 D12 | 24.0 | 19.2 | LnCAP A9 | 10.8 | 7.2
T24 D13 | 8.5 | 5.8 | LnCAP A10 | 11.0 | 8.0
T24 D15 | 10.7 | 8.0 | LnCAP A11 | 15.7 | 10.7
T24 D16 | 6.6 | 4.7 | LnCAP A12 | 3.5 | 2.3
T24 D17 | 12.0 | 7.4 | LnCAP A13 | 5.7 | 3.3
CAPaN B1 | 64.6 | 52.1 | LnCAP A14 | 3.3 | 1.7
CAPaN B2 | 46.3 | 33.2 | LnCAP A15 | 2.5 | 1.3
CAPaN B3 | 13.0 | 10.7 | LnCAP A16 | 12.5 | 8.6
CAPaN B4 | 39.8 | 30.4 | LnCAP A17 | 12.2 | 2.5
CAPaN B5 | 39.5 | 28.7 | Primary Astrocytes | 47.3 | 27.9
CAPaN B6 | [illegible] | [illegible] | Primary Renal Proximal Tubule Epithelial cell A2 | 100.0 | 100.0
CAPaN B7 | 30.1 | 31.2 | Primary melanocytes A5 | 40.1 | 21.8
CAPaN B8 | [illegible] | [illegible] | 126443 - 341 medullo | 0.7 | 0.4
CAPaN B9 | 38.7 | 50.0 | 126444 - 487 medullo | 2.2 | 1.8
CAPaN B10 | 57.4 | 51.4 | 126445 - 425 medullo | 1.6 | 1.0
CAPaN B11 | 45.1 | 28.5 | 126446 - 690 medullo | 4.4 | 2.6
CAPaN B12 | 31.4 | 22.7 | 126447 - 54 adult | 33.4 | 22.2
CAPaN B13 | 38.7 | 29.7 | 126448 - 245 adult glioma | 9.4 | 6.3
CAPaN B14 | 29.9 | 22.1 | 126449 - 317 adult glioma | 10.4 | 6.0
CAPaN B15 | 32.8 | 20.7 | 126450 - 212 glioma | 41.5 | 22.8
CAPaN B16 | 29.7 | 16.4 | 126451 - 456 glioma | 17.4 | 11.3
CAPaN B17 | 42.3 | 24.3







Table ANG. Panel 1.3D

Tissue Name | Rel. Exp.(%) Ag2052, Run 166004256 | Tissue Name | Rel. Exp.(%) Ag2052, Run 166004256
Liver adenocarcinoma | 21.8 | Kidney (fetal) | 19.2
Pancreas | 4.2 | Renal ca. 786-0 | 8.4
Pancreatic ca. CAPAN 2 | 24.5 | Renal ca. A498 | 26.4
Adrenal gland | 11.7 | Renal ca. RXF393 | 34.4
Thyroid | 37.6 | Renal ca. ACHN | 9.3
Salivary gland | 25.3 | Renal ca. UO-31 | 33.7
Pituitary gland | 13.8 | Renal ca. TK-10 | 2.8
Brain (fetal) | 11.7 | Liver | 14.0
Brain (whole) | 51.4 | Liver (fetal) | 16.2
Brain (amygdala) | 29.5 | Liver ca. (hepatoblast) HepG2 | 33.9
Brain (cerebellum) | 24.3 | Lung | 22.8
Brain (hippocampus) | 24.5 | Lung (fetal) | 10.7
Brain (substantia nigra) | 17.8 | Lung ca. (small cell) LX-1 | 25.2
Brain (thalamus) | 27.5 | Lung ca. (small cell) NCI-H69 | 2.1
Cerebral Cortex | 45.4 | Lung ca. (s.cell var.) SHP-77 | 6.9
Spinal cord | 30.4 | Lung ca. (large cell) NCI-H460 | 11
glio/astro U87-MG | 42.6 | Lung ca. (non-sm. cell) A549 | 4.4
glio/astro U-118-MG | 23.5 | Lung ca. (non-s.cell) NCI-H23 | 4.4
astrocytoma SW1783 | 24.3 | |
neuro*; met SK-N-AS | 5.4 | Lung ca. (non-s.cl) NCI-H522 | 3.4
astrocytoma SF-539 | 43.8 | Lung ca. (squam.) SW 900 | 18.4
astrocytoma SNB-75 | 21.9 | Lung ca. (squam.) NCI-H596 | 1.9
glioma SNB-19 | 20.7 | Mammary gland | 15.5
glioma U251 | 43.2 | Breast ca.* (pl.ef) MCF-7 | 10.7
glioma SF-295 | 25.5 | Breast ca.* (pl.ef) MDA-MB-231 | 13.2
Heart (fetal) | 15.2 | Breast ca.* (pl.ef) T47D | 6.0
Heart | 13.7 | Breast ca. BT-549 | 100.0
Skeletal muscle (fetal) | 8.2 | Breast ca. MDA-N | 3.7
Skeletal muscle | 11.8 | Ovary | 23.5
Bone marrow | 19.5 | Ovarian ca. OVCAR-3 | 14.1
Thymus | 7.7 | Ovarian ca. OVCAR-4 | 20.7
Spleen | 34.6 | Ovarian ca. OVCAR-5 | 23.5
Lymph node | 17.4 | Ovarian ca. OVCAR-8 | 7.8
Colorectal | 12.5 | Ovarian ca. IGROV-1 | 5.1
Stomach | 8.0 | Ovarian ca.* (ascites) SK-OV-3 | 27.9
Small intestine | 12.2 | Uterus | 11.0
Colon ca. SW480 | 9.7 | Placenta | 40.3
Colon ca.* SW620 (SW480 met) | 5.9 | Prostate | 8.0
Colon ca. HT29 | 1.2 | Prostate ca.* (bone met) PC-3 | 8.4
Colon ca. HCT-116 | 4.8 | Testis | 4.3
Colon ca. CaCo-2 | 15.7 | Melanoma Hs688(A).T | 22.7
Colon ca. tissue (OD03866) | 62.4 | Melanoma* (met) Hs688(B).T | 21.8
Colon ca. HCC-2998 | 12.9 | Melanoma UACC-62 | 23.0
Gastric ca.* (liver met) NCI-N87 | 21.9 | Melanoma M14 | 43.2
Bladder | 11.4 | Melanoma LOX IMVI | 11.2
Trachea | 13.1 | Melanoma* (met) SK-MEL-5 | 22.8
Kidney | 31.0 | Adipose | 12.8



Table ANH. Panel 2.2

Tissue Name | Rel. Exp.(%) Ag2052, Run 174244470 | Tissue Name | Rel. Exp.(%) Ag2052, Run 174244470
Normal Colon | 3.3 | Kidney Margin (OD04348) | 13.1
Colon cancer (OD06064) | 23.3 | Kidney malignant cancer (OD06204B) | 1.0
Colon Margin (OD06064) | 3.6 | Kidney normal adjacent tissue (OD06204E) | 9.5
Colon cancer (OD06159) | 1.5 | Kidney Cancer (OD04450-01) | 22.2
Colon Margin (OD06159) | 3.6 | Kidney [illegible] | [illegible]
Colon cancer (OD06297-04) | 1.3 | Kidney Cancer 8120613 | [illegible]
Colon Margin (OD06297-05) | 4.7 | Kidney Margin 8120614 | [illegible]
CC Gr.2 ascend colon (OD03921) | 1.5 | Kidney Cancer 9010320 | 10.7
CC Margin (OD03921) | 2.6 | Kidney Margin 9010321 | [illegible]
Colon cancer metastasis (OD06104) | 6.7 | Kidney Cancer 8120607 | 9.7
Lung Margin (OD06104) | 6.0 | Kidney Margin 8120608 | 11.4
Colon mets to lung (OD04451-01) | 12.8 | Normal Uterus | 3.1
Lung Margin (OD04451-02) | 6.0 | Uterine Cancer 064011 | 3.5
Normal Prostate | 2.3 | Normal Thyroid | 7.2
Prostate Cancer (OD04410) | 0.7 | Thyroid Cancer 064010 | 44.8
Prostate Margin (OD04410) | 1.2 | [illegible] | 100.0
Normal Ovary | 6.1 | Thyroid Margin A302153 | 7.6
Ovarian cancer (OD06283-03) | 4.1 | Normal Breast | 2.2
Ovarian Margin (OD06283-07) | 2.0 | Breast Cancer (OD04566) | 2.5
Ovarian Cancer 064008 | 9.2 | Breast Cancer 1024 | 6.3
Ovarian cancer (OD06145) | 8.9 | Breast Cancer (OD04590-01) | 8.5
Ovarian Margin (OD06145) | 3.8 | Breast Cancer Mets (OD04590-03) | 4.4
Ovarian cancer (OD06455-03) | 6.1 | Breast Cancer Metastasis (OD04655-05) | 3.3
Ovarian Margin (OD06455-07) | 1.0 | Breast Cancer 064006 | 4.9
Normal Lung | 4.9 | Breast Cancer 9100266 | 2.7
Invasive poor diff. lung adeno (OD04945-01) | 2.9 | Breast Margin 9100265 | 1.7
Lung Margin (OD04945-03) | 3.2 | Breast Cancer A209073 | 1.5
Lung Malignant Cancer (OD03126) | 11.1 | Breast Margin A2090734 | 2.3
Lung Margin (OD03126) | 5.1 | Breast Cancer [illegible] |
Lung Cancer (OD05014A) | 19.6 | [illegible] (OD06083) | 5.6
Lung Margin (OD05014B) | 15.3 | Normal Liver | 6.9
Lung cancer (OD06081) | 3.4 | Liver Cancer 1026 | 8.0
Lung Margin (OD06081) | 1.3 | Liver Cancer 1025 | 22.2
Lung Cancer (OD04237-01) | 4.6 | Liver Cancer 6004-T | 13.8
Lung Margin (OD04237-02) | 11.1 | Liver Tissue 6004-N | [illegible]
Ocular Melanoma Metastasis | 3.5 | Liver Cancer 6005-T | 21.5
Ocular Melanoma Margin (Liver) | 9.8 | Liver Tissue 6005-N | 51.1
Melanoma Metastasis | 5.4 | Liver Cancer 064003 | 13.6
Melanoma Margin (Lung) | 5.1 | Normal Bladder | 2.8
Normal Kidney | 3.3 | Bladder Cancer 1023 | 4.8
Kidney Ca, Nuclear grade 2 (OD04338) | 5.0 | Bladder Cancer A302173 | 6.1
Kidney Margin (OD04338) | 10.6 | Normal Stomach | 5.3
Kidney Ca, Nuclear grade 1/2 (OD04339) | 15.0 | Gastric Cancer 9060397 | 6.3
Kidney Margin (OD04339) | 11.3 | Stomach Margin 9060396 | 5.0
Kidney Ca, Clear cell type (OD04340) | 4.2 | Gastric Cancer 9060395 | 4.6
Kidney Margin (OD04340) | 7.2 | Stomach Margin 9060394 | 7.7
Kidney Ca, Nuclear grade 3 (OD04348) | 3.1 | Gastric Cancer 064005 | 3.8





Table ANI. Panel 4.1D

Tissue Name | Rel. Exp.(%) Ag5278, Run 230472911 | Tissue Name | Rel. Exp.(%) Ag5278, Run 230472911
Secondary Th1 act | 3.4 | HUVEC IL-1beta |
Secondary Th2 act | 3.3 | |
Secondary Tr1 act | 1.2 | HUVEC TNF alpha + IFN gamma | [illegible]
Secondary Th1 rest | 0.4 | HUVEC TNF alpha + IL4 | 2.1
Secondary Th2 rest | 0.0 | HUVEC IL-11 | 3.6
Secondary Tr1 rest | 0.0 | Lung Microvascular EC none | 27.7
Primary Th1 act | 0.0 | Lung Microvascular EC TNFalpha + IL-1beta |
Primary Th2 act | 1.1 | Microvascular Dermal EC none | 4.2
Primary Tr1 act | 1.4 | Microvascular Dermal EC TNFalpha + IL-1beta | 3.0
Primary Th1 rest | 0.5 | Bronchial epithelium TNFalpha + IL-1beta | 9.1
Primary Th2 rest | 0.5 | Small airway epithelium none | 22.1
Primary Tr1 rest | 0.9 | Small airway epithelium TNFalpha + IL-1beta | 33.9
CD45RA CD4 lymphocyte act | 5.0 | Coronary artery SMC rest | 6.2
CD45RO CD4 lymphocyte act | 1.6 | Coronary artery SMC TNFalpha + IL-1beta | 11.3
CD8 lymphocyte act | 0.4 | Astrocytes rest | 2.3
Secondary CD8 lymphocyte rest | 1.3 | Astrocytes TNFalpha + IL-1beta | 3.1
Secondary CD8 lymphocyte act | 0.0 | KU-812 (Basophil) rest | 1.9
CD4 lymphocyte none | 0.0 | KU-812 (Basophil) PMA/ionomycin | 10.9
2ry Th1/Th2/Tr1_anti-CD95 CH11 | 0.0 | CCD1106 (Keratinocytes) none | 5.8
LAK cells rest | 18.6 | CCD1106 (Keratinocytes) TNFalpha + IL-1beta | 4.8
LAK cells IL-2 | 0.6 | Liver cirrhosis | 1.9
LAK cells IL-2+IL-12 | 0.0 | |
LAK cells IL-2+IFN gamma | 0.7 | NCI-H292 IL-4 | 8.4
LAK cells IL-2+ IL-18 | 0.9 | [illegible] | [illegible]
LAK cells PMA/ionomycin | 62.4 | NCI-H292 IL-13 | [illegible]
NK Cells IL-2 rest | 1.0 | NCI-H292 IFN gamma | [illegible]
Two Way MLR 3 day | 9.4 | [illegible] | [illegible]
Two Way MLR 5 day | 3.9 | HPAEC TNF alpha + IL-1 beta | 25.3
Two Way MLR 7 day | 2.3 | Lung fibroblast none |
PBMC rest | 0.6 | Lung fibroblast TNF alpha + IL-1 beta | 12.2
PBMC PWM | 1.1 | Lung fibroblast IL-4 | 3.9
PBMC PHA-L | 2.2 | Lung fibroblast IL-9 | 11.8
Ramos (B cell) none | 0.0 | Lung fibroblast IL-13 | 5.4
Ramos (B cell) ionomycin | 0.0 | Lung fibroblast IFN gamma | 19.5
B lymphocytes PWM | 0.0 | Dermal fibroblast CCD1070 rest | 32.1
B lymphocytes CD40L and IL-4 | 1.4 | Dermal fibroblast CCD1070 TNF alpha | 66.0
EOL-1 dbcAMP | 1.4 | Dermal fibroblast CCD1070 IL-1 beta | 21.8
EOL-1 dbcAMP PMA/ionomycin | 1.4 | Dermal fibroblast IFN gamma | 42.3
Dendritic cells none | 100.0 | Dermal fibroblast IL-4 | 45.1
Dendritic cells LPS | 34.9 | Dermal fibroblasts rest | [illegible]
Dendritic cells anti-CD40 | 44.8 | Neutrophils TNFa+LPS | 0.0
Monocytes rest | 1.4 | Neutrophils rest | 0.6
Monocytes LPS | 19.9 | Colon | 0.0
Macrophages rest | 12.5 | Lung | 1.4
Macrophages LPS | 11.2 | Thymus | 0.0
HUVEC none | 5.9 | Kidney | 12.8
HUVEC starved | 11.7







Table ANJ. Panel 4D

Tissue Name | Rel. Exp.(%) Ag2052, Run 161706487 | Tissue Name | Rel. Exp.(%) Ag2052, Run 161706487
Secondary Th1 act | 2.6 | HUVEC IL-1beta | 2.1
Secondary Th2 act | 1.7 | HUVEC IFN gamma | 5.2
Secondary Tr1 act | 1.9 | HUVEC TNF alpha + IFN gamma | 5.7
Secondary Th1 rest | 0.3 | HUVEC TNF alpha + IL4 | 4.5
Secondary Th2 rest | 0.5 | HUVEC IL-11 | 2.6
Secondary Tr1 rest | 0.6 | Lung Microvascular EC none | 9.9
Primary Th1 act | 1.4 | Lung Microvascular EC TNFalpha + IL-1beta |
Primary Th2 act | 0.7 | Microvascular Dermal EC none | 16.6
Primary Tr1 act | 1.2 | Microvascular Dermal EC TNFalpha + IL-1beta | 9.2
Primary Th1 rest | 2.2 | Bronchial epithelium TNFalpha + IL-1beta | 3.1
Primary Th2 rest | 1.4 | Small airway epithelium none | [illegible]
Primary Tr1 rest | 0.2 | Small airway epithelium TNFalpha + IL-1beta | 46.0
CD45RA CD4 lymphocyte act | [illegible] | Coronary artery SMC rest | 5.4
CD45RO CD4 lymphocyte act | 1.4 | Coronary artery SMC TNFalpha + IL-1beta | 4.3
CD8 lymphocyte act | 0.3 | Astrocytes rest | 2.2
Secondary CD8 lymphocyte rest | 1.4 | Astrocytes TNFalpha + IL-1beta | 2.0
Secondary CD8 lymphocyte act | 0.4 | KU-812 (Basophil) rest | 1.5
CD4 lymphocyte none | 0.4 | KU-812 (Basophil) PMA/ionomycin | 11.0
2ry Th1/Th2/Tr1_anti-CD95 CH11 | 0.8 | CCD1106 (Keratinocytes) none | 3.1
LAK cells rest | 43.2 | CCD1106 (Keratinocytes) TNFalpha + IL-1beta | 0.8
LAK cells IL-2 | 0.8 | Liver cirrhosis | 1.5
LAK cells IL-2+IL-12 | 1.8 | Lupus kidney | 0.7
LAK cells IL-2+IFN gamma | 3.2 | NCI-H292 none | 5.8
LAK cells IL-2+ IL-18 | 2.1 | NCI-H292 IL-4 | 5.5
LAK cells PMA/ionomycin | 26.2 | NCI-H292 IL-9 | 7.4
NK Cells IL-2 rest | 0.3 | NCI-H292 IL-13 | 1.1
Two Way MLR 3 day | 9.2 | NCI-H292 IFN gamma | 3.3
Two Way MLR 5 day | 9.3 | HPAEC none | 5.6
Two Way MLR 7 day | 2.0 | HPAEC TNF alpha + IL-1 beta | 10.7
PBMC rest | 1.0 | Lung fibroblast none | 6.3
PBMC PWM | 5.3 | Lung fibroblast TNF alpha + IL-1 beta | 6.3
PBMC PHA-L | 5.0 | Lung fibroblast IL-4 | 10.4
Ramos (B cell) none | 0.0 | Lung fibroblast IL-9 | 8.1
Ramos (B cell) ionomycin | 0.0 | Lung fibroblast IL-13 | 5.6
B lymphocytes PWM | 2.2 | Lung fibroblast IFN gamma | [illegible]
B lymphocytes CD40L and IL-4 | 1.2 | Dermal fibroblast CCD1070 rest | 15.5
EOL-1 dbcAMP | 0.7 | Dermal fibroblast CCD1070 TNF alpha | 18.9
EOL-1 dbcAMP PMA/ionomycin | | Dermal fibroblast CCD1070 IL-1 beta | 11.1
Dendritic cells none | 66.9 | Dermal fibroblast IFN gamma | 19.6
Dendritic cells LPS | 37.6 | Dermal fibroblast IL-4 | 21.2
Dendritic cells anti-CD40 | 77.9 | IBD Colitis 2 | [illegible]
Monocytes rest | 5.1 | IBD Crohn's | [illegible]
Monocytes LPS | 17.2 | Colon | 3.9
Macrophages rest | 100.0 | Lung | 19.8
Macrophages LPS | 40.9 | Thymus | 12.9
HUVEC none | 5.3 | Kidney | 2.4
HUVEC starved | 10.6







Table ANK. Panel 5 Islet

Tissue Name | Rel. Exp.(%) Ag2052, Run 279370795 | Tissue Name | Rel. Exp.(%) Ag2052, Run 279370795
97457_Patient-02go_adipose | 15.6 | 94709_Donor 2 AM - A_adipose | 24.7
97476_Patient-07sk_skeletal muscle | 0.0 | 94710_Donor 2 AM - B_adipose | [illegible]
97477_Patient-07ut_uterus | 22.1 | 94711_Donor 2 AM - C_adipose | 14.7
97478_Patient-07pl_placenta | 13.1 | 94712_Donor 2 AD - A_adipose | 64.2
99167_Bayer Patient 1 | 17.6 | 94713_Donor 2 AD - B_adipose | 89.5
97482_Patient-08ut_uterus | 15.3 | 94714_Donor 2 AD - C_adipose | 66.4
97483_Patient-08pl_placenta | 11.6 | 94742_Donor 3 U - A_Mesenchymal Stem Cells | 17.3
97486_Patient-09sk_skeletal muscle | 4.8 | 94743_Donor 3 U - B_Mesenchymal Stem Cells | 23.2
97487_Patient-09ut_uterus | 15.5 | 94730_Donor 3 AM - A_adipose | 54.0
97488_Patient-09pl_placenta | 7.9 | 94731_Donor 3 AM - B_adipose | 76.3
97492_Patient-10ut_uterus | 14.5 | 94732_Donor 3 AM - C_adipose | 59.9
97493_Patient-10pl_placenta | 23.8 | 94733_Donor 3 AD - A_adipose | 100.0
97495_Patient-11go_adipose | 11.9 | 94734_Donor 3 AD - B_adipose | 92.0
97496_Patient-11sk_skeletal muscle | 3.2 | 94735_Donor 3 AD - C_adipose | 32.1
97497_Patient-11ut_uterus | 36.9 | 77138_Liver_HepG2untreated | 62.9
97498_Patient-11pl_placenta | 7.0 | 73556_Heart_Cardiac stromal cells (primary) | 0.3
97500_Patient-12go_adipose | 17.2 | 81735_Small Intestine | 10.9
97501_Patient-12sk_skeletal muscle | 8.4 | 72409_Kidney_Proximal Convoluted Tubule | 23.7
97502_Patient-12ut_uterus | 25.2 | 82685_Small intestine_Duodenum | 9.3
97503_Patient-12pl_placenta | 23.8 | 90650_Adrenal_Adrenocortical adenoma | 8.4
94721_Donor 2 U - A_Mesenchymal Stem Cells | 61.6 | 72410_Kidney_HRCE | 40.1
94722_Donor 2 U - B_Mesenchymal Stem Cells | 45.1 | 72411_Kidney_HRE | 13.5
94723_Donor 2 U - C_Mesenchymal Stem Cells | 53.2 | 73139_Uterus_Uterine smooth muscle cells | 61.1





AI_comprehensive panel_v1.0 Summary: Ag2052 Highest expression of this
gene is detected in synovium from an osteoarthritis (OA) patient (CT=20.3). High levels of
expression of this gene are detected in samples derived from normal and osteoarthritis/
rheumatoid arthritis bone and adjacent bone, cartilage, synovium and synovial fluid
samples, from normal lung, COPD lung, emphysema, atopic asthma, asthma, allergy,
Crohn's disease (normal matched control and diseased), ulcerative colitis (normal matched
control and diseased), and psoriasis (normal matched control and diseased). Therefore,
therapeutic modulation of this gene product may ameliorate symptoms/conditions
associated with autoimmune and inflammatory disorders including psoriasis, allergy,
asthma, inflammatory bowel disease, rheumatoid arthritis and osteoarthritis.

CNS_neurodegeneration_v1.0 Summary: Ag5277/Ag5278 Expression of this
gene is low/undetectable (CTs > 35) across all of the samples on this panel.
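
The relationship between a CT value and the relative expression percentages reported in the tables can be sketched as follows. This is a minimal illustration, assuming that each run is normalized to its highest-expressing sample (lowest CT, set to 100%) and that each PCR cycle ideally doubles the product. The function names and the example CT values are hypothetical and are not taken from the runs above; the cutoff of 35 mirrors the low/undetectable threshold used in the summaries.

```python
def relative_expression(ct_values):
    """Convert {sample: CT} into {sample: percent of the highest-expressing sample}.

    Assumes ideal amplification (product doubles each cycle), so a sample whose
    CT is one cycle higher than the minimum has half the expression.
    """
    ct_min = min(ct_values.values())
    return {name: 100.0 * 2.0 ** (ct_min - ct) for name, ct in ct_values.items()}


def is_low_or_undetectable(ct, cutoff=35.0):
    """Flag a sample as low/undetectable when its CT exceeds the cutoff."""
    return ct > cutoff


# Hypothetical CT values, for illustration only.
example_cts = {"sample_a": 29.0, "sample_b": 30.0, "sample_c": 36.0}
example_rel = relative_expression(example_cts)

if __name__ == "__main__":
    for name, pct in example_rel.items():
        flag = " (low/undetectable)" if is_low_or_undetectable(example_cts[name]) else ""
        print(f"{name}: {pct:.1f}%{flag}")
```

Under these assumptions a difference of one cycle corresponds to a two-fold difference in expression, which is why small CT differences translate into large differences in the percentage columns.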

General_screening_panel_v1.5 Summary: Ag5278 Highest expression of this
gene is detected in breast cancer BT-549 cell line (CT=29). Moderate levels of expression
of this gene are also seen in a cluster of cancer cell lines derived from pancreatic, gastric,
colon, lung, liver, renal, breast, ovarian, melanoma and brain cancers. In addition, moderate
to low levels of expression of this gene are seen in all regions of the brain and in tissues
with metabolic/endocrine functions such as pancreas, adrenal gland, thyroid, fetal liver and
colon. Please see panel 1.3D for further discussion of this gene.

Ag5277 Expression of this gene is low/undetectable (CTs > 35) across all of the 
samples on this panel. 

HASS Panel v1.0 Summary: Ag2052 Two experiments with the same probe and
primer sets are in excellent agreement. This gene shows widespread expression in this
panel, with highest expression in primary renal proximal tubular epithelial cells cultured in
vitro (CTs=20-22). The expression of this gene is also higher in the glioblastoma type of
brain cancer than in medulloblastoma, suggesting that it may play a greater role in
glioblastoma development than in medulloblastoma development. Expression is also higher in
U87-MG cells that are deprived of nutrients and oxygen and exposed to an acidic pH
than in the control population (comparing the control U87-MG F4 with U87-MG F5, F7 and
F10). This suggests that the serum-starved, hypoxic and acidotic regions of brain cancers
may express this gene at a higher level and that this may [illegible]
regions.
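
The comparison between the control U87-MG F4 sample and the stressed F5, F7 and F10 samples can be made explicit as a ratio of relative expression values. The sketch below uses the values listed for these samples in Table ANF (two runs each); the variable and function names are hypothetical, and the simple ratio is only an illustration of the comparison, not a statistical test.

```python
# Relative expression (%) from Table ANF, as (Run 247736616, Run 248455625).
CONTROL_F4 = (27.0, 17.8)
STRESSED = {
    "U87-MG F5": (59.0, 38.2),
    "U87-MG F7": (72.7, 50.7),
    "U87-MG F10": (65.1, 50.0),
}


def fold_over_control(control, samples):
    """Return {sample: (fold in run 1, fold in run 2)} relative to the control."""
    return {
        name: tuple(round(value / ref, 2) for value, ref in zip(values, control))
        for name, values in samples.items()
    }


FOLDS = fold_over_control(CONTROL_F4, STRESSED)

if __name__ == "__main__":
    for name, (run1, run2) in FOLDS.items():
        print(f"{name}: {run1}x (run 1), {run2}x (run 2) over U87-MG F4")
```

In both runs each of the three stressed samples comes out at more than twice the control value, which is consistent with the induction described in the summary.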

Panel 1.3D Summary: Ag2052 This gene shows widespread expression in this
panel. Highest expression of this gene is detected in breast cancer BT-549 cell line
(CT=24.9). High levels of expression of this gene are also seen in a cluster of cancer cell lines
derived from pancreatic, gastric, colon, lung, liver, renal, breast, ovarian, prostate,
melanoma and brain cancers. Thus, expression of this gene could be used as a marker to
detect the presence of these cancers. Furthermore, therapeutic modulation of the expression
or function of this gene may be effective in the treatment of pancreatic, gastric, colon, lung,
liver, renal, breast, ovarian, prostate, melanoma and brain cancers.

Among tissues with metabolic or endocrine function, this gene is expressed at high
levels in pancreas, adipose, adrenal gland, thyroid, pituitary gland, skeletal muscle, heart,
liver and the gastrointestinal tract. Therefore, therapeutic modulation of the activity of this
gene may prove useful in the treatment of endocrine/metabolically related diseases, such as
obesity and diabetes.

In addition, this gene is expressed at high levels in all regions of the central nervous
system examined, including amygdala, hippocampus, substantia nigra, thalamus,
cerebellum, cerebral cortex, and spinal cord. Therefore, therapeutic modulation of this gene
product may be useful in the treatment of central nervous system disorders such as
Alzheimer's disease, Parkinson's disease, epilepsy, multiple sclerosis, schizophrenia and
depression.

Panel 2.2 Summary: Ag2052 Highest expression of this gene is detected in
thyroid cancer (CT=23.9). High to moderate levels of expression of this gene are also seen in
normal and cancer samples derived from melanoma, colon, gastric, bladder, liver, breast,
thyroid, uterine, kidney, lung, ovarian and prostate cancers. Interestingly, higher levels of
expression of this gene are associated with kidney and thyroid cancers as compared to
the corresponding normal tissue. Therefore, expression of this gene may be used as a diagnostic
marker to detect the presence of these cancers. Furthermore, therapeutic modulation of this
gene may be useful in the treatment of melanoma, colon, gastric, bladder, liver, breast,
thyroid, uterine, kidney, lung, ovarian and prostate cancers.

Panel 4.1D Summary: Ag5278 The highest level of expression of this gene is
detected in resting dendritic cells (CT=32). Moderate to low levels of expression of this
gene are also seen in activated dendritic cells, PMA/ionomycin stimulated LAK cells, LPS
activated macrophage, lung microvascular endothelial cells, [illegible]
airway epithelium, and dermal fibroblasts. Therefore, therapeutic modulation of this gene
or its protein product may alter the functions associated with these cell types and would be
beneficial in the treatment of autoimmune and inflammatory diseases such as asthma,
allergies, inflammatory bowel disease, lupus erythematosus, psoriasis, rheumatoid arthritis,
and osteoarthritis.

Ag5277 Expression of this gene is low/undetectable (CTs > 35) across all of the 
samples on this panel. 

Panel 4D Summary: Ag2052 Highest expression of this gene is detected in resting
macrophage (CT=21). This gene is expressed at high to moderate levels in a wide range of
cell types of significance in the immune response in health and disease. These cells include
members of the T-cell, B-cell, dendritic cell, endothelial cell, macrophage/monocyte, and
peripheral blood mononuclear cell family, as well as epithelial and fibroblast cell types
from lung and skin, and normal tissues represented by colon, lung, thymus and kidney. This
ubiquitous pattern of expression suggests that this gene product may be involved in
homeostatic processes for these and other cell types and tissues. This pattern is in
agreement with the expression profile in General_screening_panel_v1.3 and also suggests a
role for the gene product in cell survival and proliferation. Therefore, modulation of the
gene product with a functional therapeutic may lead to the alteration of functions associated
with these cell types and lead to improvement of the symptoms of patients suffering from
autoimmune and inflammatory diseases such as asthma, allergies, inflammatory bowel
disease, lupus erythematosus, psoriasis, rheumatoid arthritis, and osteoarthritis.

Panel 5 Islet Summary: Ag2052 Highest expression of this gene is detected in
differentiated adipose tissue (CT=24.4). Moderate to high levels of expression are seen in
placenta, uterus, adipose, skeletal muscle, small intestine, heart and kidney. This gene
shows ubiquitous expression, which correlates with the expression in panel 1.3D. Please see
panel 1.3D for further discussion of this gene.

AO. CG56836-04: Cathepsin B.

Expression of gene CG56836-04 was assessed using the primer-probe set Ag5264,
described in Table AOA. Results of the RTQ-PCR runs are shown in Tables AOB, AOC
and AOD.

Table AOA. Probe Name Ag5264

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-tcctgctgggtttctggt-3' | 18 | 455 | 411
Probe | TET-5'-ccgtactccatccctccctgtgagc-3'-TAMRA | 25 | 503 | 412
Reverse | 5'-tgtttgtaggtcgggctgta-3' | 20 | 605 | 413
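
The lengths and start positions listed in Table AOA can be checked for internal consistency with a short script. This is a minimal sketch: the helper names are hypothetical, and the amplicon span calculation assumes that every start position is the leftmost coordinate of the oligonucleotide's footprint on the sense strand, a convention the table itself does not state.

```python
# Sequences, stated lengths and stated start positions transcribed from Table AOA.
PRIMERS = {
    "forward": ("tcctgctgggtttctggt", 18, 455),
    "probe": ("ccgtactccatccctccctgtgagc", 25, 503),
    "reverse": ("tgtttgtaggtcgggctgta", 20, 605),
}

COMPLEMENT = {"a": "t", "c": "g", "g": "c", "t": "a"}


def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def lengths_match(primers):
    """Check that each sequence has the length stated in the table."""
    return {name: len(seq) == length for name, (seq, length, _start) in primers.items()}


def probe_between_primers(primers):
    """Check that the probe footprint lies between the two primer footprints."""
    _, fwd_len, fwd_start = primers["forward"]
    _, probe_len, probe_start = primers["probe"]
    _, _, rev_start = primers["reverse"]
    return fwd_start + fwd_len <= probe_start and probe_start + probe_len <= rev_start


def amplicon_length(primers):
    """Span from the forward primer start to the end of the reverse primer footprint.

    Assumption: start positions are leftmost sense-strand coordinates.
    """
    _, _, fwd_start = primers["forward"]
    _, rev_len, rev_start = primers["reverse"]
    return rev_start + rev_len - fwd_start


if __name__ == "__main__":
    print(lengths_match(PRIMERS))
    print("probe between primers:", probe_between_primers(PRIMERS))
    print("amplicon length under the stated assumption:", amplicon_length(PRIMERS))
    print("sense-strand footprint of reverse primer:", reverse_complement(PRIMERS["reverse"][0]))
```

A check of this kind is useful mainly for catching transcription errors in the sequences, since a dropped or duplicated base shows up as a length mismatch.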



Table AOB. CNS_neurodegeneration_v1.0

Tissue Name | Rel. Exp.(%) Ag5264, Run 230512807 | Tissue Name | Rel. Exp.(%) Ag5264, Run 230512807
AD 1 Hippo | 10.2 | Control (Path) 3 Temporal Ctx | 3.6
AD 2 Hippo | 32.5 | Control (Path) 4 Temporal Ctx | 18.4
AD 3 Hippo | 9.3 | AD 1 Occipital Ctx | 14.7
AD 4 Hippo | 3.8 | AD 2 Occipital Ctx (Missing) | 0.0
AD 5 Hippo | 94.0 | AD 3 Occipital Ctx | 7.3
AD 6 Hippo | 66.9 | AD 4 Occipital Ctx | 13.4
Control 2 Hippo | 25.0 | AD 5 Occipital Ctx | 15.3
Control 4 Hippo | 13.0 | AD 6 Occipital Ctx | 39.0
Control (Path) 3 Hippo | 4.0 | Control 1 Occipital Ctx | 5.9
AD 1 Temporal Ctx | 9.8 | Control 2 Occipital Ctx | 53.6
AD 2 Temporal Ctx | 25.2 | Control 3 Occipital Ctx | 8.4
AD 3 Temporal Ctx | 3.9 | Control 4 Occipital Ctx | 6.3
AD 4 Temporal Ctx | 7.5 | Control (Path) 1 Occipital Ctx | 83.5
AD 5 Inf Temporal Ctx | 74.7 | Control (Path) 2 Occipital Ctx | 6.0
AD 5 Sup Temporal Ctx | 43.8 | Control (Path) 3 Occipital Ctx | 1.7
AD 6 Inf Temporal Ctx | 71.2 | Control (Path) 4 Occipital Ctx | 13.1
AD 6 Sup Temporal Ctx | 41.8 | Control 1 Parietal Ctx | 2.9
Control 1 Temporal Ctx | 5.9 | Control 2 Parietal Ctx | 30.1
Control 2 Temporal Ctx | 45.1 | Control 3 Parietal Ctx | 12.3
Control 3 Temporal Ctx | 12.0 | Control (Path) 1 Parietal Ctx | 100.0
Control 4 Temporal Ctx | 6.7 | Control (Path) 2 Parietal Ctx | 12.6
Control (Path) 1 Temporal Ctx | 47.3 | Control (Path) 3 Parietal Ctx | 2.5
Control (Path) 2 Temporal Ctx | 15.9 | Control (Path) 4 Parietal Ctx | 44.1

Table AOC. General_screening_panel_v1.5



Tissue Name 


Rel.

Exp.(%) 
Ag5264, 
Run 

232936651 


Tissue Name


Rel. 

Exp.(%) 
Ag5264, 
Run 

232936651 


Adipose 


0.7 


Renal ca. TK-10 


3.6 


Melanoma* Hs688(A).T 


19.5 


Bladder 


3.8 


Melanoma* Hs688(B).T 


9.0 


Gastric ca. (liver met.) NCI-N87 


10.2 


Melanoma* M14 


24.7 


Gastric ca. KATO III 


5.5 


Melanoma* LOXIMVI


15.6 


Colon ca. SW-948 


1.2 


Melanoma* SK-MEL-5 


9.7 


Colon ca. SW480 


7.0 


Squamous cell carcinoma SCC-4 


3.1 


Colon ca.* (SW480 met) SW620 


2.0 


Testis Pool 


0.4 


Colon ca. HT29 


0.6 


Prostate ca.* (bone met) PC-3 


2.0 


Colon ca. HCT-116 


3.1 


Prostate Pool 


0.6 


Colon ca. CaCo-2 


5.2 


Placenta 


3.7 


Colon cancer tissue 


8.6 


Uterus Pool 


0.2 


Colon ca. SW1116


2.4 


Ovarian ca. OVCAR-3 


6.7 


Colon ca. Colo-205 


4.1 


Ovarian ca. SK-OV-3 


7.2 


Colon ca. SW-48 


1.3 


Ovarian ca. OVCAR-4 


4.2 


Colon Pool 


1.2 


Ovarian ca. OVCAR-5 


6.2 


Small Intestine Pool 


0.7 


Ovarian ca. IGROV-1 


1.5 


Stomach Pool 


1.3 


Ovarian ca. OVCAR-8 


2.2 


Bone Marrow Pool 


0.7 


Ovary 


1.4 


Fetal Heart 


0.5 


Breast ca. MCF-7 


2.7 


Heart Pool 


1.3 


Breast ca. MDA-MB-231 


4.9 


Lymph Node Pool 


2.2 


Breast ca. BT 549 


100.0 


Fetal Skeletal Muscle 


0.3 


Breast ca. T47D 


1.3 


Skeletal Muscle Pool 


1.3 


Breast ca. MDA-N 


1.1 


Spleen Pool 


1.2 


Breast Pool 


1.7 


Thymus Pool 


0.9 


Trachea 


3.0 


CNS cancer (glio/astro) U87-MG 


12.6 


Lung 


0.2 


CNS cancer (glio/astro) U-118-MG


9.0 


Fetal Lung 


1.6 


CNS cancer (neuro;met) SK-N-AS


Z.J. 


Lung ca. NCI-N417


0.2 


CNS cancer (astro) SF-539 


7.4 


Lung ca. LX-1 


4.5 


CNS cancer (astro) SNB-75 


22.5 


Lung ca. NCI-H146


0.2 


CNS cancer (glio) SNB-19 


1.7 


Lung ca. SHP-77 


1.6 


CNS cancer (glio) SF-295 


15.6 


Lung ca. A549 


4.1 


Brain (Amygdala) Pool 


1.4 


Lung ca. NCI-H526


0.2 


Brain (cerebellum) 


5.6 


Lung ca. NCI-H23 


2.2 


Brain (fetal) 


1.0 


Lung ca. NCI-H460


1.2 


Brain (Hippocampus) Pool 


1.3 


Lung ca. HOP-62 


5.6 


Cerebral Cortex Pool 


1.6 
Lung ca. NCI-H522


1.4 




,$137 : 


Liver 


1.7 


Brain (Thalamus) Pool 


2.1 


Fetal Liver 


4.9 


Brain (whole) 


3.1 


Liver ca. HepG2 


4.9 


Spinal Cord Pool 


1.6 


Kidney Pool 


2.4 


Adrenal Gland 


2.1 


Fetal Kidney 1 


1.0 


Pituitary gland Pool 


0.4 


Renal ca. 786-0 


1.0 


Salivary Gland 


1.6


Renal ca. A498 


1.7 


Thyroid (female) 


16.7 


Renal ca. ACHN 


4.0 


Pancreatic ca. CAPAN2 


5.6 


Renal ca. UO-31 


11.2 


Pancreas Pool 


2.8 



Table AOD. Panel 4.1D

Tissue Name                     Rel. Exp.(%)   Tissue Name                                  Rel. Exp.(%)
                                Ag5264, Run                                                 Ag5264, Run
                                230472870                                                   230472870

Secondary Th1 act               4.0            HUVEC IL-1beta                               9.2
Secondary Th2 act               33             HUVEC IFN gamma                              7.2
Secondary Tr1 act               1.2            HUVEC TNF alpha + IFN gamma                  4.6
Secondary Th1 rest              0.3            HUVEC TNF alpha + IL4                        5.1
Secondary Th2 rest              0.2            HUVEC IL-11                                  4.5
Secondary Tr1 rest              0.2            Lung Microvascular EC none                   32.5
Primary Th1 act                 0.5            Lung Microvascular EC TNFalpha + IL-1beta    10.3
Primary Th2 act                 0.7            Microvascular Dermal EC none                 4.2
Primary Tr1 act                 1.0            Microvascular Dermal EC TNFalpha + IL-1beta  2.8
Primary Th1 rest                0.2            Bronchial epithelium TNFalpha + IL-1beta     113
Primary Th2 rest                0.3            Small airway epithelium none                 15.8
Primary Tr1 rest                0.2            Small airway epithelium TNFalpha + IL-1beta  20.2
CD45RA CD4 lymphocyte act       4.6            Coronary artery SMC rest                     6.0
CD45RO CD4 lymphocyte act       1.7            Coronary artery SMC TNFalpha + IL-1beta      5.1
CD8 lymphocyte act              0.3            Astrocytes rest                              1.5
Secondary CD8 lymphocyte rest   1.1            Astrocytes TNFalpha + IL-1beta               1.9
Secondary CD8 lymphocyte act    0.3            KU-812 (Basophil) rest                       1.7
CD4 lymphocyte none             0.1            KU-812 (Basophil) PMA/ionomycin              8.9
2ry Th1/Th2/Tr1 anti-CD95 CH11  0.8            CCD1106 (Keratinocytes) none                 6.8
LAK cells rest                  39.2           CCD1106 (Keratinocytes) TNFalpha + IL-1beta  5.0
LAK cells IL-2                  0.6            Liver cirrhosis
LAK cells IL-2+IL-12            0.1            NCI-H292 none                                3.6
LAK cells IL-2+IFN gamma        0.3            NCI-H292 IL-4                                4.7
LAK cells IL-2+IL-18            0.3            NCI-H292 IL-9                                5.4
LAK cells PMA/ionomycin         54.3           NCI-H292 IL-13                               3.3
NK Cells IL-2 rest              0.6            NCI-H292 IFN gamma                           2.4
Two Way MLR 3 day               9.0            HPAEC none                                   3.7
Two Way MLR 5 day               3.4            HPAEC TNF alpha + IL-1 beta                  27.0
Two Way MLR 7 day               1.3            Lung fibroblast none                         10.7
PBMC rest                       0.4            Lung fibroblast TNF alpha + IL-1 beta        10.4
PBMC PWM                        0.7            Lung fibroblast IL-4                         4.5
PBMC PHA-L                      2.7            Lung fibroblast IL-9                         8.2
Ramos (B cell) none             0.0            Lung fibroblast IL-13                        2.2
Ramos (B cell) ionomycin        0.0            Lung fibroblast IFN gamma                    16.0
B lymphocytes PWM               0.5            Dermal fibroblast CCD1070 rest               17.6
B lymphocytes CD40L and IL-4    1.0            Dermal fibroblast CCD1070 TNF alpha          [illegible]
EOL-1 dbcAMP                    1.0            Dermal fibroblast CCD1070 IL-1 beta          16.7
EOL-1 dbcAMP PMA/ionomycin      0.9            Dermal fibroblast IFN gamma                  31.6
Dendritic cells none            100.0          Dermal fibroblast IL-4
Dendritic cells LPS             31.9           Dermal Fibroblasts rest                      14.6
Dendritic cells anti-CD40       36.3           Neutrophils TNFa+LPS                         0.2
Monocytes rest                  1.4            Neutrophils rest                             0.2
Monocytes LPS                   40.9           Colon                                        0.0
Macrophages rest                26.1           Lung                                         1.4
Macrophages LPS                 16.7           Thymus                                       0.2
HUVEC none                      4.7            Kidney                                       9.7
HUVEC starved                   5.8







CNS_neurodegeneration_v1.0 Summary: Ag5264 This panel confirms the
expression of this gene at low levels in the brains of an independent group of individuals.
However, no differential expression of this gene was detected between Alzheimer's
diseased postmortem brains and those of non-demented controls in this experiment. Please
see Panel 1.5 for a discussion of the potential utility of this gene in treatment of central
nervous system disorders.
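
The comparison described above can be illustrated with a rough descriptive summary. The sketch below takes the hippocampus Rel. Exp.(%) values reported for Ag5264 on the CNS_neurodegeneration_v1.0 panel and compares the spread of the AD and control groups. Pooling the "Control" and "Control (Path)" samples into one group, the helper name `summarize`, and the use of medians and range overlap are all assumptions made for illustration; this is not a statistical test and is not necessarily the analysis that led to the conclusion stated here.

```python
from statistics import median

# Rel. Exp.(%) for Ag5264, hippocampus samples only, transcribed from the
# CNS_neurodegeneration_v1.0 table. Grouping is an illustrative assumption.
ad_hippo = {
    "AD 1": 10.2, "AD 2": 32.5, "AD 3": 9.3,
    "AD 4": 3.8, "AD 5": 94.0, "AD 6": 66.9,
}
control_hippo = {
    "Control 2": 25.0, "Control 4": 13.0, "Control (Path) 3": 4.0,
}


def summarize(values):
    """Return count, minimum, median and maximum of a group of values."""
    vals = sorted(values)
    return {"n": len(vals), "min": vals[0], "median": median(vals), "max": vals[-1]}


ad_summary = summarize(ad_hippo.values())
control_summary = summarize(control_hippo.values())

# Two closed intervals overlap when each one starts before the other ends.
ranges_overlap = (
    ad_summary["min"] <= control_summary["max"]
    and control_summary["min"] <= ad_summary["max"]
)

print("AD:", ad_summary)
print("Control:", control_summary)
print("Ranges overlap:", ranges_overlap)
```

With so few samples per group, overlapping ranges are only indicative; a formal comparison would need the underlying CT values and an appropriate test.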

General_screening_panel_v1.5 Summary: Ag5264 Highest expression of this
gene is detected in the breast cancer BT-549 cell line (CT=25). Moderate levels of expression
of this gene are also seen in a cluster of cancer cell lines derived from pancreatic, gastric,
colon, lung, liver, renal, breast, ovarian, prostate, melanoma and brain cancers. Thus,
expression of this gene could be used as a marker to detect the presence of these cancers.
Furthermore, therapeutic modulation of the expression or function of this gene may be
effective in the treatment of pancreatic, gastric, colon, lung, liver, renal, breast, ovarian,
prostate, melanoma and brain cancers.

Among tissues with metabolic or endocrine function, this gene is expressed at
moderate levels in pancreas, adipose, adrenal gland, thyroid, pituitary gland, skeletal
muscle, heart, liver and the gastrointestinal tract. Therefore, therapeutic modulation of the
activity of this gene may prove useful in the treatment of endocrine/metabolically related
diseases, such as obesity and diabetes.

In addition, this gene is expressed at moderate levels in all regions of the central
nervous system examined, including amygdala, hippocampus, substantia nigra, thalamus,
cerebellum, cerebral cortex, and spinal cord. Therefore, therapeutic modulation of this gene
product may be useful in the treatment of central nervous system disorders such as
Alzheimer's disease, Parkinson's disease, epilepsy, multiple sclerosis, schizophrenia and
depression.

Panel 4.1D Summary: Ag5264 Highest levels of expression of this gene are
detected in resting dendritic cells (CT=28.7). Moderate to low levels of expression of this
gene are also seen in activated dendritic cells, resting and PMA/ionomycin stimulated LAK
cells, monocytes, macrophages, different types of endothelial cells, small airway epithelium,
lung and dermal fibroblasts and normal tissues represented by lung and kidney. This gene is
upregulated in LPS treated monocytes, cytokine treated HPAEC, and activated secondary
Th1 and Th2 cells. Therefore, therapeutic modulation of this gene or its protein product may
alter the functions associated with these cell types and would be beneficial in the treatment
of autoimmune and inflammatory diseases such as asthma, allergies, inflammatory bowel
disease, lupus erythematosus, psoriasis, rheumatoid arthritis, and osteoarthritis.

AP. CG57284-03: RAS-RELATED PROTEIN RAB-5C.

Expression of gene CG57284-03 was assessed using the primer-probe set Ag6892,
described in Table APA. Results of the RTQ-PCR runs are shown in Tables APB and APC.
Please note that this sequence represents a full-length physical clone.
Table APA. Probe Name Ag6892

Primers   Sequences                                     Length   Start Position   SEQ ID No
Forward   5'-gtgtcatccaggcagacagtct-3'                  22       473              414
Probe     TET-5'-ccgctccaattgtgctctcctggtact-3'-TAMRA   27       507              415
Reverse   5'-cgctttgtcaagggacagttt-3'                   21       538              416


Table APB. General screening panel v1.6



Tissue Name 


Rel. 

Exp.(%) 
Ag6892, 
Run 


Tissue Name


Rel. 

Exp.(%) 
Ag6892, 
Run 


Adipose 


1 1 0 




Renal ca. TK-10


41.5


Melanoma* Hs688(A).T




Bladder


19.1 


Melanoma* Hs688(B).T


11 0 
JjAJ 




Gastric ca. (liver met.) NCI-N87


26.4


Melanoma* M14


RS 1 


Gastric ca. KATO III


93.3 


Melanoma* LOXIMVI


*+0.v 


Colon ca SW-948 


15.7 


Melanoma* SK-MEL-5


40 7 


Colon ca SW480 


62.4 


Squamous cell carcinoma SCC-4 


28.5 


Colon ca.* (SW480 met) SW620 


9.5 


Testis Pool 


10.1 


Colon ca. HT29 


20.7 


Prostate ca * (bone met) PC-3 


0.0 


Colon ca. HCT-116 


48.0 


Prostate Pool 


10.6 


Colon ca. CaCo-2 


49.7 


Placenta 


22.4 


Colon cancer tissue 


19.3 


Uterus Pool 


4.8 


Colon ca.SW1116 


6.7 


Ovarian ca. OVCAR-3 


18.9 


Colon ca. Colo-205 


13.3 


Ovarian ca. SK-OV-3 


63.3 


Colon ca. SW-48


16.5 


Ovarian ca. OVCAR-4 


17.4 


Colon Pool 


15.5 


Ovarian ca. OVCAR-5 


41.5 


Small Intestine Pool 


8.7 


Ovarian ca. IGROV-1 


18.4 


Stomach Pool 


8.0 


Ovarian ca. OVCAR-8 


13.8 


Bone Marrow Pool 


8.5 


Ovary 


10.6 


Fetal Heart 


5.9 


Breast ca. MCF-7 


33.2 


Heart Pool 


6.3 


Breast ca. MDA-MB-231 


46.0 


Lymph Node Pool 


16.4 


Breast ca. BT 549 


37.4 


Fetal Skeletal Muscle 


5.4 


Breast ca. T47D 


35.1 


Skeletal Muscle Pool 


1.6 


Breast ca. MDA-N 


22.2 


Spleen Pool 


8.8 


Breast Pool 


12.7 


Thymus Pool


8.7 


Trachea 


12.0 


CNS cancer (glio/astro) U87-MG 


35.4 


Lung 


2.5 


CNS cancer (glio/astro) U-118-MG


55.9 
Fetal Lung


32.5 


CNS cancer (neuro;met) SK-N-AS




Lung ca. NCI-N417 


5.4 


CNS cancer (astro) SF-539 


28.9 


Lung ca. LX-1 


20.2 


CNS cancer (astro) SNB-75 


52.9 


Lung ca. NCI-H146


8.6 


CNS cancer (glio) SNB-19


21.2 


Lung ca. SHP-77 


20.2 


CNS cancer (glio) SF-295 


100.0 


Lung ca. A549 


51.1 


Brain (Amygdala) Pool 


10.6 


Lung ca. NCI-H526


5.6 


Brain (cerebellum) 


49.0 


Lung ca. NCI-H23


23.7 


Brain (fetal) 


25.9 


Lung ca. NCI-H460


19.1 


Brain (Hippocampus) Pool 


13.0 


Lung ca. HOP-62


21.0 


Cerebral Cortex Pool 


17.3 


Lung ca. NCI-H522


31.4 


Brain (Substantia nigra) Pool 


11.2 


Liver 


5.7 


Brain (Thalamus) Pool 


19.6 


Fetal Liver 


19.8 


Brain (whole) 


23.0 


Liver ca. HepG2 


10.3 


Spinal Cord Pool 


12.5 


Kidney Pool 


15.9 


Adrenal Gland 


24.8 


Fetal Kidney 


14.0 


Pituitary gland Pool 


2.7 


Renal ca. 786-0 


24.3 


Salivary Gland 


11.3 


Renal ca. A498 


21.9 


Thyroid (female) 


9.8 


Renal ca. ACHN 


22.2 


Pancreatic ca. CAPAN2 


24.8 


Renal ca. UO-31 


35.4 


Pancreas Pool 


8.1 



Table APC. Panel 5 Islet

Tissue Name                                  Rel. Exp.(%)   Tissue Name                                  Rel. Exp.(%)
                                             Ag6892, Run                                                 Ag6892, Run
                                             305424859                                                   305424859

97457_Patient-02go_adipose                   4.5            94709_Donor 2 AM - A_adipose                 44.1
97476_Patient-07sk_skeletal muscle           0.0            94710_Donor 2 AM - B_adipose                 30.8
97477_Patient-07ut_uterus                    8.2            94711_Donor 2 AM - C_adipose                 21.0
97478_Patient-07pl_placenta                  13.1           94712_Donor 2 AD - A_adipose                 48.0
99167_Bayer Patient 1                        23.2           94713_Donor 2 AD - B_adipose                 54.0
97482_Patient-08ut_uterus                    7.7            94714_Donor 2 AD - C_adipose                 50.3
97483_Patient-08pl_placenta                  18.9           94742_Donor 3 U - A_Mesenchymal Stem Cells   14.7
97486_Patient-09sk_skeletal muscle           4.4            94743_Donor 3 U - B_Mesenchymal Stem Cells   10.4
97487_Patient-09ut_uterus                    19.6           94730_Donor 3 AM - A_adipose                 53.2
97488_Patient-09pl_placenta                  11.3           94731_Donor 3 AM - B_adipose                 74.2
97492_Patient-10ut_uterus                    12.2           94732_Donor 3 AM - C_adipose                 58.6
97493_Patient-10pl_placenta                  34.9           94733_Donor 3 AD - A_adipose                 64.6
97495_Patient-11go_adipose                   9.2            94734_Donor 3 AD - B_adipose                 100.0
97496_Patient-11sk_skeletal muscle           3.8            94735_Donor 3 AD - C_adipose                 20.4
97497_Patient-11ut_uterus                    25.0           77138_Liver_HepG2untreated                   71.2
97498_Patient-11pl_placenta                  8.8            73556_Heart_Cardiac stromal cells (primary)  18.6
97500_Patient-12go_adipose                   10.4           81735_Small Intestine                        12.4
97501_Patient-12sk_skeletal muscle           12.7           72409_Kidney_Proximal Convoluted Tubule      81.2
97502_Patient-12ut_uterus                    18.9           82685_Small intestine_Duodenum               8.1
97503_Patient-12pl_placenta                  17.8           90650_Adrenal_Adrenocortical adenoma         4.8
94721_Donor 2 U - A_Mesenchymal Stem Cells   27.9           72410_Kidney_HRCE                            37.9
94722_Donor 2 U - B_Mesenchymal Stem Cells   25.7           72411_Kidney_HRE                             18.8
94723_Donor 2 U - C_Mesenchymal Stem Cells   30.4           73139_Uterus_Uterine smooth muscle cells     48.0



General_screening_panel_v1.6 Summary: Ag6892 Highest expression of this
gene is seen in a brain cancer cell line (CT=24.1). This gene is ubiquitously expressed in
this panel, with high levels of expression seen in brain, colon, gastric, lung, breast, ovarian,
and melanoma cancer cell lines. This expression profile suggests a role for this gene
product in cell survival and proliferation. Modulation of this gene product may be useful in
the treatment of cancer.

Among tissues with metabolic function, this gene is expressed at high levels in
pituitary, adipose, adrenal gland, pancreas, thyroid, and adult and fetal skeletal muscle,
heart, and liver. This widespread expression among these tissues suggests that this gene
product may play a role in normal neuroendocrine and metabolic function and that
deregulated expression of this gene may contribute to neuroendocrine disorders or
metabolic diseases, such as obesity and diabetes.

This gene is also expressed at high levels in the CNS, including the hippocampus,
thalamus, substantia nigra, amygdala, cerebellum and cerebral cortex. Therefore,
therapeutic modulation of the expression or function of this gene may be useful in the
treatment of neurologic disorders, such as Alzheimer's disease, Parkinson's disease,
schizophrenia, multiple sclerosis, stroke and epilepsy.

In addition, this gene is expressed at much higher levels in fetal lung tissue
(CT=25.7) when compared to expression in the adult counterpart (CT=29.4). Thus,
expression of this gene may be used to differentiate between the fetal and adult source of
this tissue.
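
The CT values quoted in this summary can be related to the Rel. Exp.(%) values in the general screening panel table by a short calculation. The sketch below assumes that the amount of product doubles with each PCR cycle (100% amplification efficiency) and that relative expression is normalized to the sample with the lowest CT, set to 100%. The function names are hypothetical and the formula is a common convention rather than a method quoted from this section.

```python
def relative_expression(ct: float, ct_min: float) -> float:
    """Percent expression relative to the sample with the lowest CT.

    Assumes one cycle of CT difference corresponds to a two-fold change.
    """
    return 100.0 * 2.0 ** (ct_min - ct)


def fold_difference(ct_high_expr: float, ct_low_expr: float) -> float:
    """Fold excess of the lower-CT sample over the higher-CT sample."""
    return 2.0 ** (ct_low_expr - ct_high_expr)


# CT values quoted in the Ag6892 general screening panel summary.
CT_BRAIN_CANCER_LINE = 24.1  # highest-expressing sample on the panel (100%)
CT_FETAL_LUNG = 25.7
CT_ADULT_LUNG = 29.4

fetal_pct = relative_expression(CT_FETAL_LUNG, CT_BRAIN_CANCER_LINE)
adult_pct = relative_expression(CT_ADULT_LUNG, CT_BRAIN_CANCER_LINE)
fetal_vs_adult = fold_difference(CT_FETAL_LUNG, CT_ADULT_LUNG)

print(f"Fetal lung: {fetal_pct:.1f}% of maximum")
print(f"Adult lung: {adult_pct:.1f}% of maximum")
print(f"Fetal/adult fold difference: {fetal_vs_adult:.1f}")
```

Under these assumptions the calculation gives roughly 33% for fetal lung and 2.5% for adult lung, close to the tabulated values of 32.5 and 2.5, and a difference of about 13-fold between the two tissues.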

Panel 5 Islet Summary: Ag6892 Highest expression is seen in adipose (CT=26),
with nearly ubiquitous expression seen across the samples on this panel. High to moderate
levels of expression are seen in metabolic tissues, including skeletal muscle, adipose, and
placenta, in agreement with Panel 1.6. Please see that panel for discussion of this gene in
metabolic disease.

AQ. CG57308-02: Sulfonylurea Receptor 1 Splice Variant.

Expression of gene CG57308-02 was assessed using the primer-probe set Ag7558,
described in Table AQA. Results of the RTQ-PCR runs are shown in Tables AQB and
AQC.

Table AQA. Probe Name Ag7558

Primers   Sequences                                     Length   Start Position   SEQ ID No
Forward   5'-tcgaagggcacatcatca-3'                      18       4319             417
Probe     TET-5'-tgcctctgtccctggctgaaattctc-3'-TAMRA    26       4348             418
Reverse   5'-tgaagatgctggtcttcctca-3'                   21       4400             419


Table AQB. CNS neurodegeneration v1.0



Tissue Name 


Rel. 

Exp.(%) 
Ag7558, 
Run 

308750599 


Tissue Name


Rel. 

Exp.(%) 
Ag7558,
Run 

308750599 


AD 1 Hippo 


4.2 


Control (Path) 3 Temporal Ctx 


3.3 


AD 2 Hippo 


16.4 


Control (Path) 4 Temporal Ctx 


50.3 


AD 3 Hippo 


1.7 


AD 1 Occipital Ctx 


11.1 


AD 4 Hippo 


11.3 


AD 2 Occipital Ctx (Missing) 


0.0 


AD 5 Hippo 


76.3 


AD 3 Occipital Ctx 


2.3 


AD 6 Hippo 


38.7 


AD 4 Occipital Ctx 


19.8 


Control 2 Hippo 


17.8 


AD 5 Occipital Ctx 


45.4 


Control 4 Hippo 


3.9 


AD 6 Occipital Ctx 


21.2 


Control (Path) 3 Hippo 


1.0 


Control 1 Occipital Ctx 


0.9 


AD 1 Temporal Ctx 


7.6 


Control 2 Occipital Ctx 


82.4 
AD 2 Temporal Ctx


24.5 


Control 3 Occipital Ctx




AD 3 Temporal Ctx 


4.0 


Control 4 Occipital Ctx 


0.0 


AD 4 Temporal Ctx 


32.3 


Control (Path) 1 Occipital Ctx 


100.0 


AD 5 Inf Temporal Ctx 


78.5 


Control (Path) 2 Occipital Ctx 


17.1 


AD 5 Sup Temporal Ctx 


25.3 


Control (Path) 3 Occipital Ctx 


0.0 


AD 6 Inf Temporal Ctx 


39.2 


Control (Path) 4 Occipital Ctx 


31.9 


AD 6 Sup Temporal Ctx 


71.7 


Control 1 Parietal Ctx 


1.8 


Control 1 Temporal Ctx


4.3 


Control 2 Parietal Ctx 


36.9 


Control 2 Temporal Ctx 


33.2 


Control 3 Parietal Ctx 


21.5 


Control 3 Temporal Ctx 


13.8 


Control (Path) 1 Parietal Ctx 


87.1 


Control 4 Temporal Ctx


2.5 


Control (Path) 2 Parietal Ctx 


41.5 


Control (Path) 1 Temporal Ctx 


55.9 


Control (Path) 3 Parietal Ctx 


3.7 


Control (Path) 2 Temporal Ctx 


65.1 


Control (Path) 4 Parietal Ctx 


79.0 



Table AQC. Panel 5 Islet

Tissue Name                                  Rel. Exp.(%)   Tissue Name                                  Rel. Exp.(%)
                                             Ag7558, Run                                                 Ag7558, Run
                                             312000203                                                   312000203

97457_Patient-02go_adipose                   0.0            94709_Donor 2 AM - A_adipose                 0.0
97476_Patient-07sk_skeletal muscle           0.0            94710_Donor 2 AM - B_adipose                 0.0
97477_Patient-07ut_uterus                    0.0            94711_Donor 2 AM - C_adipose                 0.0
97478_Patient-07pl_placenta                  0.0            94712_Donor 2 AD - A_adipose                 0.0
99167_Bayer Patient 1                        100.0          94713_Donor 2 AD - B_adipose                 0.0
97482_Patient-08ut_uterus                    0.0            94714_Donor 2 AD - C_adipose                 0.0
97483_Patient-08pl_placenta                  0.0            94742_Donor 3 U - A_Mesenchymal Stem Cells   0.0
97486_Patient-09sk_skeletal muscle           0.0            94743_Donor 3 U - B_Mesenchymal Stem Cells   0.0
97487_Patient-09ut_uterus                    0.0            94730_Donor 3 AM - A_adipose                 0.0
97488_Patient-09pl_placenta                  0.0            94731_Donor 3 AM - B_adipose                 0.0
97492_Patient-10ut_uterus                    0.0            94732_Donor 3 AM - C_adipose                 0.0
97493_Patient-10pl_placenta                  0.0            94733_Donor 3 AD - A_adipose                 0.0
97495_Patient-11go_adipose                   0.0            94734_Donor 3 AD - B_adipose                 0.0
97496_Patient-11sk_skeletal muscle           0.0            94735_Donor 3 AD - C_adipose                 0.0
97497_Patient-11ut_uterus                    0.0            77138_Liver_HepG2untreated                   0.0
97498_Patient-11pl_placenta                  0.0            73556_Heart_Cardiac stromal cells (primary)  0.0
97500_Patient-12go_adipose                   0.0            81735_Small Intestine                        0.0
97501_Patient-12sk_skeletal muscle           0.0            72409_Kidney_Proximal Convoluted Tubule      0.0
97502_Patient-12ut_uterus                    0.0            82685_Small intestine_Duodenum               0.0
97503_Patient-12pl_placenta                  0.0            90650_Adrenal_Adrenocortical adenoma         0.0
94721_Donor 2 U - A_Mesenchymal Stem Cells   0.0            72410_Kidney_HRCE                            0.0
94722_Donor 2 U - B_Mesenchymal Stem Cells   0.0            72411_Kidney_HRE                             0.0
94723_Donor 2 U - C_Mesenchymal Stem Cells   0.0            73139_Uterus_Uterine smooth muscle cells     0.0





CNS_neurodegeneration_v1.0 Summary: Ag7558 Highest expression of this
gene is seen in the occipital cortex of a control patient (CT=33). This panel does not show
differential expression of this gene in Alzheimer's disease. However, this profile does show
the expression of this gene at low levels in the brain. Therefore, therapeutic modulation of
the expression or function of this gene may be useful in the treatment of neurological
disorders, such as Alzheimer's disease, Parkinson's disease, schizophrenia, multiple
sclerosis, stroke and epilepsy.

Panel 4.1D Summary: Ag7558 Expression of this gene is low/undetectable in all
samples on this panel (CTs>35).

Panel 5 Islet Summary: Ag7558 Expression of this gene is limited to pancreatic
islet cells (CT=34.6). This gene codes for a variant of SUR1. SUR1 is a subunit of the
pancreatic beta cell K+ channel that regulates insulin release in glucose-stimulated cells.
Thus, therapeutic modulation of the SUR1 variant encoded by this gene may be used as a
treatment for the enhancement of insulin secretion in Type 2 diabetes.

AR. CG93659-03: MITOGEN-ACTIVATED PROTEIN
KINASE KINASE KINASE 9.

Expression of gene CG93659-03 was assessed using the primer-probe set Ag4828,
described in Table ARA. Results of the RTQ-PCR runs are shown in Tables ARB and
ARC.

Table ARA. Probe Name Ag4828

Primers   Sequences                                     Length   Start Position   SEQ ID No
Forward   5'-gaggaatctgagatgctcaaga-3'
Probe     TET-5'-caacgctctctctacatcgacctcgg-3'-TAMRA    26       1299             421
Reverse   5'-tccccgaacaagattgaagt-3'                    20       1339             422



Table ARB. General screening panel v1.4



Tissue Name


Rel.

Exp.(%)
Ag4828,
Run

217081802


Tissue Name


Rel.

Exp.(%)
Ag4828,
Run

217081802


Adipose 


53.6 


Renal ca. TK-10 


10.6 


Melanoma* Hs688(A).T 


15.5 


Bladder 


31.9 


Melanoma* Hs688(B).T 


17.4 


Gastric ca. (liver met.) NCI-N87 


36.3 


Melanoma* M14 


3.5 


Gastric ca. KATO III 


12.2 


Melanoma* LOXIMVI 


3.2 


Colon ca. SW-948 


5.4 


Melanoma* SK-MEL-5 


0.9 


Colon ca. SW480 


25.0 


Squamous cell carcinoma SCC-4 


7.0 


Colon ca * (SW480 met) SW620 


2.5 


Testis Pool 


4.7 


Colon ca. HT29 


14.3 


Prostate ca.* (bone met) PC-3 


6.3 


Colon ca. HCT-116 


2.1


Prostate Pool 


3.9 


Colon ca. CaCo-2 


15.9 


Placenta 


39.0 


Colon cancer tissue 


39.8


Uterus Pool 


9.0 


Colon ca.SW1116 


3.4 


Ovarian ca. OVCAR-3 


15.7 


Colon ca. Colo-205 


8.8 


Ovarian ca. SK-OV-3 


46.3 


Colon ca. SW-48 


5.4 


Ovarian ca. OVCAR-4 


7.1 


Colon Pool 


16.2 


Ovarian ca. OVCAR-5 


30.6 


Small Intestine Pool 


9.3 


Ovarian ca. IGROV-1 


14.1 


Stomach Pool 


17.3 


Ovarian ca. OVCAR-8 


2.7 


Bone Marrow Pool 


7.0 


Ovary 


4.5 


Fetal Heart 


2.9 


Breast ca. MCF-7 


100.0 


Heart Pool 


7.9 


Breast ca. MDA-MB-231


9.2 


Lymph Node Pool 


15.2 


Breast ca. BT 549


73.2 


Fetal Skeletal Muscle 


1.7 


Breast ca. T47D


66.0 


Skeletal Muscle Pool 


9.8 


Breast ca. MDA-N 


0.9 


Spleen Pool 


45.7 


Breast Pool 


24.1 


Thymus Pool 


15.9 


Trachea 


18.0 


CNS cancer (glio/astro) U87-MG 


7.6 


Lung 


6.7 


CNS cancer (glio/astro) U-118-MG


7.9 


Fetal Lung 


68.3 


CNS cancer (neuro;met) SK-N-AS 


2.6 


Lung ca. NCI-N417 


0.2 


CNS cancer (astro) SF-539 


2.3 


Lung ca. LX-1 


11.8 


CNS cancer (astro) SNB-75 


14.1 


Lung ca. NCI-H146


0.0 


CNS cancer (glio) SNB-19 


11.1 



Lung ca. SHP-77


0.1






Lung ca. A549


36.6


Brain (Amygdala) Pool 


2.7 


Lungca.NCI-H526 C 


10 


Brain (cerebellum) 


1.4 


Lung ca. NCI-H23


13.4


Brain (fetal) 


4.9 


Lung ca. NCI-H460


17.6


Brain (Hippocampus) Pool 


3.7 


Lung ca. HOP-62


13.2


Cerebral Cortex Pool 


3.5 


Lung ca. NCI-H522


21.1


Brain (Substantia nigra) Pool 


2.7 


Liver 


1.0 


Brain (Thalamus) Pool 


4.5 


Fetal Liver 1 


L.8 


Brain (whole) 


4.5 


Liver ca. HepG2 i 


$.1 


Spinal Cord Pool 


3.8 


Kidney Pool I 


U.4 


Adrenal Gland 


9.5 


Fetal Kidney 


7.7 


Pituitary gland Pool 


1.4 


Renal ca. 786-0 


10.9 


Salivary Gland 


2.5 


Renal ca. A498 


5.2 


Thyroid (female) 


7.7 


Renal ca. ACHN 


2.5 


Pancreatic ca. CAPAN2 


34.4 


Renal ca. UO-31 


14.9 


Pancreas Pool 


19.6 


Table ARC. Panel 5D

Tissue Name                                  Rel. Exp.(%)   Tissue Name                                  Rel. Exp.(%)
                                             Ag4828, Run                                                 Ag4828, Run
                                             219436967                                                   219436967

97457_Patient-02go_adipose                   33.9           94709_Donor 2 AM - A_adipose                 10.8
97476_Patient-07sk_skeletal muscle           33.4           94710_Donor 2 AM - B_adipose                 9.3
97477_Patient-07ut_uterus                    59.5           94711_Donor 2 AM - C_adipose                 3.0
97478_Patient-07pl_placenta                  39.8           94712_Donor 2 AD - A_adipose                 13.7
97481_Patient-08sk_skeletal muscle           25.9           94713_Donor 2 AD - B_adipose                 10.0
97482_Patient-08ut_uterus                    19.8           94714_Donor 2 AD - C_adipose                 6.7
97483_Patient-08pl_placenta                  41.5           94742_Donor 3 U - A_Mesenchymal Stem Cells   4.7
97486_Patient-09sk_skeletal muscle           6.5            94743_Donor 3 U - B_Mesenchymal Stem Cells   2.8
97487_Patient-09ut_uterus                    8.1            94730_Donor 3 AM - A_adipose                 6.3
97488_Patient-09pl_placenta                  38.4           94731_Donor 3 AM - B_adipose                 2.4
97492_Patient-10ut_uterus                    30.6           94732_Donor 3 AM - C_adipose                 2.2
97493_Patient-10pl_placenta                  72.7           94733_Donor 3 AD - A_adipose                 10.2
97495_Patient-11go_adipose                   100.0          94734_Donor 3 AD - B_adipose                 55
97496_Patient-11sk_skeletal muscle           5.8            94735_Donor 3 AD - C_adipose                 4.7
97497_Patient-11ut_uterus                    20.6           77138_Liver_HepG2untreated                   14.4
97498_Patient-11pl_placenta                  50.0           73556_Heart_Cardiac stromal cells (primary)  1.9
97500_Patient-12go_adipose                   82.4           81735_Small Intestine                        17.2
97501_Patient-12sk_skeletal muscle           19.2           72409_Kidney_Proximal Convoluted Tubule      [illegible]
97502_Patient-12ut_uterus                    23.7           82685_Small intestine_Duodenum               19.1
97503_Patient-12pl_placenta                  57.0           90650_Adrenal_Adrenocortical adenoma         8.8
94721_Donor 2 U - A_Mesenchymal Stem Cells   1.6            72410_Kidney_HRCE                            7.6
94722_Donor 2 U - B_Mesenchymal Stem Cells   3.0            72411_Kidney_HRE                             13.5
94723_Donor 2 U - C_Mesenchymal Stem Cells   2.1            73139_Uterus_Uterine smooth muscle cells     2.0



General_screening_panel_v1.4 Summary: Ag4828 Highest expression of this
gene is detected in a breast cancer MCF-7 cell line (CT=27.6). Interestingly, this gene is
expressed at much higher levels in fetal (CT=28) when compared to adult lung (CT=31).
This observation suggests that expression of this gene can be used to distinguish fetal from
adult lung. In addition, the relative overexpression of this gene in fetal lung suggests that
the protein product may enhance lung growth or development in the fetus and thus may
also act in a regenerative capacity in the adult. Therefore, therapeutic modulation of the
protein encoded by this gene could be useful in treatment of lung related diseases.

In addition, significant expression of this gene is found in a number of cancer
(pancreatic, CNS, colon, lung, breast, ovary, prostate, melanoma) cell lines. Therefore,
therapeutic modulation of the activity of this gene or its protein product, through the use of
small molecule drugs, might be beneficial in the treatment of these cancers.

Among tissues with metabolic or endocrine function, this gene is expressed at high 
to moderate levels in pancreas, adipose, adrenal gland, thyroid, skeletal muscle, heart, fetal 
liver and the gastrointestinal tract. Therefore, therapeutic modulation of the activity of this 
gene may prove useful in the treatment of endocrine/metabolically related diseases, such as 
obesity and diabetes. 

This gene encodes a protein that is homologous to mitogen-activated protein kinase
kinase kinase 8 (MAP3K8) (COT proto-oncogene serine/threonine-protein kinase) (C-COT)
(Cancer Osaka thyroid oncogene). COT is able to enhance TNF alpha production and to
activate NF-kB. Both events are connected with insulin resistance and type II diabetes (1,
2, 3). Inhibition of COT kinase would prevent overproduction of TNF alpha and activation
of NF-kB, thus improving insulin resistance and diabetes.

In addition, this gene is expressed at high levels in all regions of the central nervous
system examined, including amygdala, hippocampus, substantia nigra, thalamus,
cerebellum, cerebral cortex, and spinal cord. Recently, MKK6, a related protein, has been
shown to be associated with Alzheimer's disease (4). Therefore, based on the homology of this
protein to MKK6 and the presence of this gene in the brain, we predict that this putative
MAP3K8 may play a role in central nervous system disorders such as Alzheimer's disease,
Parkinson's disease, epilepsy, multiple sclerosis, schizophrenia and depression.
References:

1. Ballester A, Velasco A, Tobena R, Alemany S. Cot kinase activates tumor
necrosis factor-alpha gene expression in a cyclosporin A-resistant manner. J. Biol. Chem.
1998. 273, 14099-106. PMID: 9603908.

2. Bierhaus A, Schiekofer S, Schwaninger M, Andrassy M, Humpert PM, Chen J,
Hong M, Luther T, Henle T, Kloting I, Morcos M, Hofmann M, Tritschler H, Weigle B,
Kasper M, Smith M, Perry G, Schmidt AM, Stern DM, Haring HU, Schleicher E, Nawroth
PP. Diabetes-associated sustained activation of the transcription factor nuclear
factor-kappaB. Diabetes. 2001 50, 2792-808. PMID: 11723063.

3. Belich MP, Salmeron A, Johnston LH, Ley SC. TPL-2 kinase regulates the
proteolysis of the NF-kappaB-inhibitory protein NF-kappaB1 p105. Nature. 1999 397,
363-8. PMID: 9950430.

4. Zhu X, Rottkamp CA, Hartzler A, Sun Z, Takeda A, Boux H, Shimohama S,
Perry G, Smith MA. (2001) Activation of MKK6, an upstream activator of p38, in
Alzheimer's disease. J Neurochem 79(2):311-8

Panel 5D Summary: Ag4828 Highest expression of this gene is detected in
adipose tissue (CT=29). Low to moderate expression of this gene is seen in a wide range of
samples used in this panel including adipose, skeletal muscle, uterus, and placenta. This
widespread expression of this gene in tissues with metabolic or endocrine function
suggests that this gene plays a role in endocrine/metabolically related diseases, such as
obesity and diabetes.

This gene encodes a MAP3K8-like protein. Recently, activation of MAP kinase, 
ERK, a related protein, by modified LDL in vascular smooth muscle cells has been 
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implicated in the development of atherosclerosis in diabefes J (RM)?tifiM^ith^ :tL 
putative MAP3K8 may also play a role in the development of this disease. Therefore, 
therapeutic modulation of the activity of this gene or its protein product, through the use of 
small molecule drugs, might be beneficial in the treatment of artherosclerosis and diabetes. 
References: 

1. Velarde V, Jenkins AJ, Christopher J, Lyons TJ, Jaffa AA. (2001) Activation of 
MAPK by modified low-density lipoproteins in vascular smooth muscle cells. J Appl 
Physiol 91(3):1412-20. 

AS. CG94521-02 and CG94521-03: CYTOPLASMIC 
GLYCEROL-3-PHOSPHATE DEHYDROGENASE [NAD+]. 

Expression of genes CG94521-02 and CG94521-03 was assessed using the 
primer-probe set Ag3924, described in Table ASA. Results of the RTQ-PCR runs are 
shown in Tables ASB, ASC, ASD, ASE and ASF. Please note that these sequences 
represent full-length physical clones. 
Table ASA. Probe Name Ag3924 

Primers | Sequences | Length | Start Position | SEQ ID No 
Forward | 5'-actgggaagaccattgaagagt-3' | 22 | 197 | 423 
Probe | TET-5'-aaaagctccaaggaccgcagacttct-3'-TAMRA | 26 | 147 | 424 
Reverse | 5'-gtttgaggatgcggtacactt-3' | 21 | 122 | 425 
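
The Length column of Table ASA can be checked against the listed sequences. The following is a minimal sketch, assuming the sequences are transcribed correctly from the table; the names `AG3924`, `gc_fraction` and `check_lengths` are illustrative and do not come from the assay description, and the GC fraction is printed only as a convenience.

```python
# Primer and probe sequences transcribed from Table ASA (Ag3924),
# each paired with the Length value the table reports for it.
AG3924 = {
    "Forward": ("actgggaagaccattgaagagt", 22),
    "Probe": ("aaaagctccaaggaccgcagacttct", 26),
    "Reverse": ("gtttgaggatgcggtacactt", 21),
}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence (case-insensitive)."""
    seq = seq.lower()
    return (seq.count("g") + seq.count("c")) / len(seq)


def check_lengths(oligos: dict) -> dict:
    """Report, per oligo, whether the sequence length equals the tabulated length."""
    return {name: len(seq) == length for name, (seq, length) in oligos.items()}


if __name__ == "__main__":
    results = check_lengths(AG3924)
    for name, (seq, length) in AG3924.items():
        print(f"{name}: {len(seq)} nt (table: {length}), "
              f"match={results[name]}, GC={gc_fraction(seq):.2f}")
```

The same check applies unchanged to any other primer-probe set once its sequences and tabulated lengths are substituted.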



Table ASB. CNS_neurodegeneration_v1.0 



Tissue Name 


Rel. 

Exp.(%) 
Ag3924, 
Run 

212343350 


Tissue Name 


Rel. 

Exp.(%) 
Ag3924, 
Run 

212343350 


AD 1 Hippo 


8.4 


Control (Path) 3 Temporal Ctx 


6.0 


AD 2 Hippo 


21.9 


Control (Path) 4 Temporal Ctx 


2.8 


AD 3 Hippo 


8.4 


AD 1 Occipital Ctx 


14.4 


AD 4 Hippo 


TS 


AD 2 Occipital Ctx (Missing) 


0.0 


AD 5 Hippo 


92.7 


AD 3 Occipital Ctx 


4.8 


AD 6 Hippo 


24.5 


AD 4 Occipital Ctx 


14.0 
Control 2 Hippo 


25.7 


AD 5 Occipital Ctx 


[illegible] 


Control 4 Hippo 


73 


AD 6 Occipital Ctx 


55.5 


Control (Path) 3 Hippo 


8.8 


Control 1 Occipital Ctx 


6.1 


AD 1 Temporal Ctx 


8.3 


Control 2 Occipital Ctx 


47.3 


AD 2 Temporal Ctx 


23.8 


Control 3 Occipital Ctx 


9.8 


AD 3 Temporal Ctx 


4.2 


Control 4 Occipital Ctx 


4.5 


AD 4 Temporal Ctx 


15.1 


Control (Path) 1 Occipital Ctx 


64.6 


AD 5 Inf Temporal Ctx 


100.0 


Control (Path) 2 Occipital Ctx 


8.6 


AD 5 SupTemporal Ctx 


32.3 


Control (Path) 3 Occipital Ctx 


3.9 


AD 6 Inf Temporal Ctx 


39.0 


Control (Path) 4 Occipital Ctx 


15.8 


AD 6 Sup Temporal Ctx 


33.2 


Control 1 Parietal Ctx 


5.0 


Control 1 Temporal Ctx 


4.5 


Control 2 Parietal Ctx 


40.3 


Control 2 Temporal Ctx 


44.4 


Control 3 Parietal Ctx 


14.6 


Control 3 Temporal Ctx 


11.1 


Control (Path) 1 Parietal Ctx 


70.7 


Control 4 Temporal Ctx 


4.4 


Control (Path) 2 Parietal Ctx 


15.5 


Control (Path) 1 Temporal Ctx 


49.0 


Control (Path) 3 Parietal Ctx 


4.9 


Control (Path) 2 Temporal Ctx 


29.9 


Control (Path) 4 Parietal Ctx 


39.5 



Table ASC. General_screening_panel_v1.4 



Tissue Name 


Rel. 

Exp.(%) 
Ag3924, 
Run 

219515221 


Tissue Name 


Rel. 

Exp.(%) 
Ag3924, 
Run 

219515221 


Adipose 


14.0 


Renal ca. TK-10 


7.1 


Melanoma* Hs688(A).T 


3.6 


Bladder 


8.1 


Melanoma* Hs688(B).T 


4.9 


Gastric ca. (liver met.) NCI-N87 


7.7 


Melanoma* M14 


15.1 


Gastric ca. KATO III 


17.4 


Melanoma* LOXIMVI 


6.2 


Colon ca. SW-948 


25.5 


Melanoma* SK-MEL-5 


37.6 


Colon ca. SW480 


28.3 


Squamous cell carcinoma SCC-4 


1.1 


Colon ca.* (SW480 met) SW620 


6.6 


Testis Pool 


6.3 


Colon ca. HT29 


4.1 


Prostate ca.* (bone met) PC-3 


47.0 


Colon ca. HCT-116 


25.0 


Prostate Pool 


18.6 


Colon ca. CaCo-2 


6.9 


Placenta 


6.3 


Colon cancer tissue 


7.6 


Uterus Pool 


5.1 


Colon ca. SW1116 


5.2 


Ovarian ca. OVCAR-3 


11.3 


Colon ca. Colo-205 


2.6 


Ovarian ca. SK-OV-3 


6.8 


Colon ca. SW-48 


4.4 


Ovarian ca. OVCAR-4 


12.2 


Colon Pool 


9.9 


Ovarian ca. OVCAR-5 


17.9 


Small Intestine Pool 


9.3 


Ovarian ca. IGROV-1 


8.2 


Stomach Pool 


5.2 


Ovarian ca. OVCAR-8 


3.5 


Bone Marrow Pool 


4.9 
Ovary 


9.6 


Fetal Heart 


[illegible] 




Breast ca. MCF-7 


100.0 


Heart Pool 


23.7 


Breast ca. MDA-MB-231 


11.4 


Lymph Node Pool 


8.7 


Breast ca. BT 549 


11.4 


Fetal Skeletal Muscle 


11.2 


Breast ca. T47D 


40.9 


Skeletal Muscle Pool 


62.0 


Breast ca. MDA-N 


11.7 


Spleen Pool 


9.7 


Breast Pool 


8.3 


Thymus Pool 


5.8 


Trachea 


15.4 


CNS cancer (glio/astro) U87-MG 


18.2 


Lung 


2.8 


CNS cancer (glio/astro) U-118-MG 


11.3 


Fetal Lung 


21.8 


CNS cancer (neuro;met) SK-N-AS 


6.6 


Lung ca. NCI-N417 


13.4 


CNS cancer (astro) SF-539 


4.0 


Lung ca. LX-1 


8.2 


CNS cancer (astro) SNB-75 


21.9 


Lung ca. NCI-H146 


4.5 


CNS cancer (glio) SNB-19 


7.6 


Lung ca. SHP-77 


13.3 


CNS cancer (glio) SF-295 


24.0 


Lung ca. A549 


16.6 


Brain (Amygdala) Pool 


11.4 


Lung ca. NCI-H526 


2.4 


Brain (cerebellum) 


10.2 


Lung ca. NCI-H23 


2.0 


Brain (fetal) 


27.2 


Lung ca. NCI-H460 


2.9 


Brain (Hippocampus) Pool 


11.6 


Lung ca. HOP-62 


6.6 


Cerebral Cortex Pool 


17.2 


Lung ca. NCI-H522 


14.3 


Brain (Substantia nigra) Pool 


10.4 


Liver 


0.3 


Brain (Thalamus) Pool 


18.9 


Fetal Liver 


1.1 


Brain (whole) 


17.7 


Liver ca. HepG2 


3.4 


Spinal Cord Pool 


14.3 


Kidney Pool 


26.4 


Adrenal Gland 


37.9 


Fetal Kidney 


6.7 


Pituitary gland Pool 


5.0 


Renal ca. 786-0 


3.0 


Salivary Gland 


11.1 


Renal ca. A498 


1.4 


Thyroid (female) 


17.0 


Renal ca. ACHN 


2.5 


Pancreatic ca. CAPAN2 


2.8 


Renal ca. UO-31 


10.1 


Pancreas Pool 


13.3 



Table ASD. Panel 4.1D 



Tissue Name 


Rel. 
Exp.(%) 
Ag3924, 
Run 

170552351 


Tissue Name 


Rel. 

Exp.(%) 
Ag3924, 
Run 

170552351 


Secondary Th1 act 


33.9 


HUVEC IL-1beta 


19.6 


Secondary Th2 act 


35.4 


HUVEC IFN gamma 


32.3 


Secondary Tr1 act 


29.3 


HUVEC TNF alpha + IFN gamma 


8.6 


Secondary Th1 rest 


14.8 


HUVEC TNF alpha + IL4 


19.1 


Secondary Th2 rest 


23.7 


HUVEC IL-11 


17.2 


Secondary Tr1 rest 


15.8 


Lung Microvascular EC none 


16.8 
Primary Th1 act 


31.0 


Lung Microvascular EC TNFalpha 
+ IL-1beta 


11.0 


Primary Th2 act 


33.7 


Microvascular Dermal EC none 


27.7 


Primary Tr1 act 


33.9 


Microvascular Dermal EC 
TNFalpha + IL-1beta 


8.6 


Primary Th1 rest 


27.4 


Bronchial epithelium TNFalpha + 
IL-1beta 


6.7 


Primary Th2 rest 


15.3 


Small airway epithelium none 


4.7 


Primary Tr1 rest 


34.2 


Small airway epithelium TNFalpha 
+ IL-1beta 


4.0 


CD45RA CD4 lymphocyte act 


17.4 


Coronary artery SMC rest 


8.1 


CD45RO CD4 lymphocyte act 


28.3 


Coronary artery SMC TNFalpha + 
IL-1beta 


4.4 


CD8 lymphocyte act 


24.1 


Astrocytes rest 


16.4 


Secondary CD8 lymphocyte rest 


18.2 


Astrocytes TNFalpha + IL-1beta 


11.9 


Secondary CD8 lymphocyte act 


15.2 


KU-812 (Basophil) rest 


37.1 


CD4 lymphocyte none 


12.8 


KU-812 (Basophil) 
PMA/ionomycin 


35.6 


2ry Th1/Th2/Tr1_anti-CD95 


21.0 


CCD1106 (Keratinocytes) none 


9.5 


LAK cells rest 


17.8 


CCD1106 (Keratinocytes) 
TNFalpha + IL-1beta 


4.8 


LAK cells IL-2 


26.6 


Liver cirrhosis 


14.4 


LAK cells IL-2+IL-12 


17.8 


NCI-H292 none 


[illegible] 


LAK cells IL-2+IFN gamma 


17.8 


NCI-H292 IL-4 


57.0 


LAK cells IL-2+ IL-18 


[illegible] 


NCI-H292 IL-9 


81.2 


LAK cells PMA/ionomycin 


[illegible] 


NCI-H292 IL-13 


60.7 


NK Cells IL-2 rest 


[illegible] 


NCI-H292 IFN gamma 


39.0 


Two Way MLR 3 day 


[illegible] 


HPAEC none 


21.2 


Two Way MLR 5 day 


[illegible] 


HPAEC TNF alpha + IL-1 beta 


13.4 


Two Way MLR 7 day 


100.0 


Lung fibroblast none 


[illegible] 


PBMC rest 


15.6 


Lung fibroblast TNF alpha + IL-1 
beta 


6.0 


PBMC PWM 


16.5 


Lung fibroblast IL-4 


19.5 


PBMC PHA-L 


13.8 


Lung fibroblast IL-9 


30.8 


Ramos (B cell) none 


64.6 


Lung fibroblast IL-13 


22.2 


Ramos (B cell) ionomycin 


70.2 


Lung fibroblast IFN gamma 


20.0 


B lymphocytes PWM 


23.8 


Dermal fibroblast CCD1070 rest 


12.5 


B lymphocytes CD40L and IL-4 


17.0 


Dermal fibroblast CCD1070 TNF 
alpha 


30.1 


EOL-1 dbcAMP 


10.8 


Dermal fibroblast CCD1070 IL-1 
beta 


5.4 


EOL-1 dbcAMP 
PMA/ionomycin 


2.2 


Dermal fibroblast IFN gamma 


8.2 


Dendritic cells none 


13.6 


Dermal fibroblast IL-4 


17.8 


Dendritic cells LPS 


4.5 


Dermal Fibroblasts rest 


20.0 
Dendritic cells anti-CD40 


21.6 


Neutrophils TNFa+LPS 


[illegible] 


Monocytes rest 


19.8 


Neutrophils rest 


3.6 


Monocytes LPS 


3.0 


Colon 


35.6 


Macrophages rest 


14.9 


Lung 


27.7 


Macrophages LPS 


1.7 


Thymus 


27.7 


HUVEC none 


16.7 


Kidney 


66.4 


HUVEC starved 


17.7 







Table ASE. Panel 5 Islet 



Tissue Name 


Rel. 
Exp.(%) 
Ag3924, 
Run 

268363571 


Tissue Name 


Rel. 
Exp.(%) 
Ag3924, 
Run 

268363571 


97457_Patient-02go_adipose 


18.2 


94709_Donor 2 AM - A_adipose 


19.6 


97476_Patient-07sk_skeletal 
muscle 


[illegible] 






97477_Patient-07ut_uterus 


10.2 


94711_Donor 2 AM - C_adipose 


11.0 


97478_Patient-07pl_placenta 


17.0 


94712_Donor 2 AD - A_adipose 


9.5 


99167_Bayer Patient 1 


6.5 


94713_Donor 2 AD - B_adipose 


21.9 


97482_Patient-08ut_uterus 


6.8 


94714_Donor 2 AD - C_adipose 


16.7 


97483_Patient-08pl_placenta 


11.7 


94742_Donor 3 U - A_Mesenchymal 
Stem Cells 


1.8 


97486_Patient-09sk_skeletal 
muscle 


10.6 


94743_Donor 3 U - B_Mesenchymal 
Stem Cells 


1.7 


97487_Patient-09ut_uterus 


12.0 


94730_Donor 3 AM - A_adipose 


19.6 


97488_Patient-09pl_placenta 


15.4 


94731_Donor 3 AM - B_adipose 


12.5 


97492_Patient-10ut_uterus 


12.9 


94732_Donor 3 AM - C_adipose 


12.2 


97493_Patient-10pl_placenta 


29.5 


94733_Donor 3 AD - A_adipose 


10.2 


97495_Patient-11go_adipose 


17.9 


94734_Donor 3 AD - B_adipose 


9.2 


97496_Patient-11sk_skeletal 
muscle 


70.7 


94735_Donor 3 AD - C_adipose 


8.9 


97497_Patient-11ut_uterus 


18.8 


77138_Liver_HepG2untreated 


11.1 


97498_Patient-11pl_placenta 


10.3 


73556_Heart_Cardiac stromal cells 
(primary) 


5.2 


97500_Patient-12go_adipose 


31.9 


81735_Small Intestine 


15.9 


97501_Patient-12sk_skeletal 
muscle 


100.0 


72409_Kidney_Proximal Convoluted 
Tubule 


6.5 


97502_Patient-12ut_uterus 


23.8 


82685_Small intestine_Duodenum 


17.0 


97503_Patient-12pl_placenta 


8.7 


90650_Adrenal_Adrenocortical 
adenoma 


14.4 


94721_Donor 2 U - 
A_Mesenchymal Stem Cells 


3.9 


72410_Kidney_HRCE 


11.5 






94722_Donor 2 U - 
B_Mesenchymal Stem Cells 


2.8 


72411_Kidney_HRE 


4.8 


94723_Donor 2 U - 
C_Mesenchymal Stem Cells 


3.4 


73139_Uterus_Uterine smooth 
muscle cells 


2.1 



Table ASF. general oncology screening panel_v_2.4 



Tissue Name 


Rel. 

Exp.(%) 
Ag3924, 
Run 

268143856 


Tissue Name 


Rel. 

Exp.(%) 
Ag3924, 
Run 

268143856 


Colon cancer 1 


60.3 


Bladder NAT 2 


3.3 


Colon NAT 1 


29.7 


Bladder NAT 3 


2.4 


Colon cancer 2 


26.1 


Bladder NAT 4 


25.7 


Colon NAT 2 


60.7 


Prostate adenocarcinoma 1 


100.0 


Colon cancer 3 


88.9 


Prostate adenocarcinoma 2 


14.6 


Colon NAT 3 


88.9 


Prostate adenocarcinoma 3 


86.5 


Colon malignant cancer 4 


[illegible] 


Prostate adenocarcinoma 4 


[illegible] 


Colon NAT 4 


29.5 


Prostate NAT 5 


26.2 


Lung cancer 1 


17.3 


Prostate adenocarcinoma 6 


24.5 


Lung NAT 1 


7.9 


Prostate adenocarcinoma 7 


39.5 


Lung cancer 2 


31.9 


Prostate adenocarcinoma 8 


15.2 


Lung NAT 2 


14.8 


Prostate adenocarcinoma 9 


53.6 


Squamous cell carcinoma 3 


34.2 


Prostate NAT 10 


12.6 


Lung NAT 3 


5.0 


Kidney cancer 1 


12.0 


Metastatic melanoma 1 


28.3 


Kidney NAT 1 


25.9 


Melanoma 2 


4.8 


Kidney cancer 2 


53.6 


Melanoma 3 


12.9 


Kidney NAT 2 


64.6 


Metastatic melanoma 4 


42.6 


Kidney cancer 3 


12.5 


Metastatic melanoma 5 


70.7 


Kidney NAT 3 


26.6 


Bladder cancer 1 


9.3 


Kidney cancer 4 


15.0 


Bladder NAT 1 


0.0 


Kidney NAT 4 


14.6 


Bladder cancer 2 


17.7 







CNS_neurodegeneration_v1.0 Summary: Ag3924 This panel does not show 
differential expression of this gene in Alzheimer's disease. However, this profile confirms 
the expression of this gene at moderate levels in the brain. Please see Panel 1.4 for 
discussion of this gene in the central nervous system. 

General_screening_panel_v1.4 Summary: Ag3924 Highest expression of this 
gene is seen in a breast cancer cell line (CT=25.3). This gene is ubiquitously expressed in 
this panel, with high to moderate expression seen in brain, [illegible], 
ovarian, and melanoma cancer cell lines. This expression profile suggests a role for this 
gene product in cell survival and proliferation. Modulation of this gene product may be 
useful in the treatment of cancer. 

Among tissues with metabolic function, this gene is expressed at moderate to high 
levels in pituitary, adipose, adrenal gland, pancreas, thyroid, and adult and fetal skeletal 
muscle, heart, and liver. This widespread expression among these tissues suggests that this 
gene product may play a role in normal neuroendocrine and metabolic function and that 
dysregulated expression of this gene may contribute to neuroendocrine disorders or 
metabolic diseases, such as obesity and diabetes. This gene encodes a novel glycerol 
3-phosphate dehydrogenase (G3PD). 

Similar to known cytosolic glycerol 3-phosphate dehydrogenase, this putative 
G3PD may contribute to glycerol synthesis and link glycolysis with triglyceride (TG) production. This 
gene is highly expressed in skeletal muscle and diabetic skeletal muscle on Panel 5 Islet. 
Diabetic skeletal muscle has increased glycolytic activity and increased lipid content that 
interfere with insulin sensitivity. Inhibition of G3PD may balance disproportionate 
glycolysis and impair accumulation of TG in skeletal muscle. Thus, an antagonist of this 
novel G3PD may be beneficial for the treatment of diabetes. 

This gene is also expressed at high to moderate levels in the CNS, including the 
hippocampus, thalamus, substantia nigra, amygdala, cerebellum and cerebral cortex. 
Therefore, therapeutic modulation of the expression or function of this gene may be useful 
in the treatment of neurologic disorders, such as Alzheimer's disease, Parkinson's disease, 
schizophrenia, multiple sclerosis, stroke and epilepsy. 

In addition, this gene is expressed at much higher levels in fetal lung tissue 
(CT=27.5) when compared to expression in the adult counterpart (CT=30.5). Thus, 
expression of this gene may be used to differentiate between the fetal and adult source of 
this tissue. 
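
The size of the fetal versus adult lung difference can be estimated from the two CT values. The following is a minimal sketch, assuming ideal amplification in which the template doubles with every cycle and equal RNA input in both samples; the function name `fold_difference` and its `efficiency` parameter are illustrative assumptions rather than terms used in the assay description.

```python
def fold_difference(ct_a: float, ct_b: float, efficiency: float = 2.0) -> float:
    """Approximate fold excess of sample A over sample B from their CT values.

    A lower CT means the detection threshold was reached in fewer cycles,
    i.e. more starting template. `efficiency` is the assumed amplification
    factor per cycle (2.0 corresponds to perfect doubling).
    """
    return efficiency ** (ct_b - ct_a)


if __name__ == "__main__":
    fetal_lung_ct = 27.5  # value quoted in the summary above
    adult_lung_ct = 30.5  # value quoted in the summary above
    # A 3-cycle gap under perfect doubling is 2**3.
    print(fold_difference(fetal_lung_ct, adult_lung_ct))  # 8.0
```

Under these assumptions the three-cycle gap corresponds to roughly eight-fold more transcript in fetal lung; a real amplification efficiency below 2.0 would give a smaller ratio.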

Panel 4.1D Summary: Ag3924 Highest expression is seen in a sample derived 
from an MLR, where the sample was taken 7 days after the reaction (CT=27.6). This gene is 
also expressed at high to moderate levels in a wide range of cell types of significance in the 
immune response in health and disease. These cells include members of the T-cell, B-cell, 
endothelial cell, macrophage/monocyte, and peripheral blood mononuclear cell family, as 
well as epithelial and fibroblast cell types from lung and skin, and normal tissues 
represented by colon, lung, thymus and kidney. This ubiquitous pattern of expression 
suggests that this gene product may be involved in homeostatic processes for these and 
other cell types and tissues. This pattern is in agreement with the expression profile in 
General_screening_panel_v1.4 and also suggests a role for the gene product in cell survival 
and proliferation. Therefore, modulation of the gene product with a functional therapeutic 
may lead to the alteration of functions associated with these cell types and lead to 
improvement of the symptoms of patients suffering from autoimmune and inflammatory 
diseases such as asthma, allergies, inflammatory bowel disease, lupus erythematosus, 
psoriasis, rheumatoid arthritis, and osteoarthritis. 

Panel 5 Islet Summary: Ag3924 Highest expression is seen in skeletal muscle 
from a diabetic patient (patient 12) (CT=28). This panel confirms expression of this gene in 
metabolic tissues including adipose, skeletal muscle and placenta. Please see Panel 1.4 for 
discussion of this gene in metabolic disease. 

general oncology screening panel_v_2.4 Summary: Ag3924 Highest expression 
is seen in a prostate cancer sample (CT=28.2). Prominent expression is also seen in 
melanoma samples, as well as in normal and malignant kidney, colon and lung. Thus, 
modulation of this gene may be useful in the treatment of prostate cancer and melanoma. 

AT. CG96613-02 and CG96613-03: Splice variant of PDK1. 

Expression of genes CG96613-02 and CG96613-03 was assessed using the 
primer-probe sets Ag1778 and Ag5110, described in Tables ATA and ATB. Results of the 
RTQ-PCR runs are shown in Tables ATC, ATD, ATE, ATF, ATG and ATH. Please note 
that primer-probe set Ag1778 is specific for CG96613-03. 
Table ATA. Probe Name Ag1778 

Primers | Sequences | Length | Start Position | SEQ ID No 
Forward | 5'-gattgcccatatcacgtcttta-3' | 22 | 1241 | 426 
Probe | TET-5'-cgcacaatacttccaaggagacctga-3'-TAMRA | 26 | 1263 | 427 
Reverse | 5'-gataactgcatctgtcccgtaa-3' | 22 | 1308 | 428 

Table ATB. Probe Name Ag5110 

Primers | Sequences | Length | Start Position | SEQ ID No 
Forward | 5'-tgtatggcctgcaagatgat-3' | 20 | 559 | 429 
Probe | TET-5'-tcattcccacaatggcccagg-3'-TAMRA | 21 | 623 | 430 
Reverse | 5'-agctctccttgtattcaatcaca-3' | 23 | 645 | 431 



Table ATC. CNS_neurodegeneration_v1.0 



Tissue Name 


Rel. 
Exp.(%) 
Ag1778, 
Run 
276596797 


Rel. 
Exp.(%) 
Ag5110, 
Run 
226442922 


Rel. 
Exp.(%) 
Ag5110, 
Run 
276596798 


Tissue Name 


Rel. 
Exp.(%) 
Ag1778, 
Run 
276596797 


Rel. 
Exp.(%) 
Ag5110, 
Run 
226442922 


Rel. 
Exp.(%) 
Ag5110, 
Run 
276596798 


AD 1 Hippo 


11.7 


6.2 


5.3 


Control 
(Path) 3 
Temporal 
Ctx 


6.6 


12.2 


17.7 


AD 2 Hippo 


31.4 


7.4 


20.3 


Control 
(Path) 4 
Temporal 
Ctx 


33.4 


15.8 


13.3 


AD 3 Hippo 


12.5 


5.3 


4.9 


AD 1 

Occipital Ctx 


23.0 


7.7 


8.0 


AD 4 Hippo 


5.4 


9.4 


0.0 


AD 2 

Occipital Ctx 
(Missing) 


0.0 


0.0 


0.0 


AD 5 Hippo 


82.4 


79.0 


45.4 


AD 3 

Occipital Ctx 


12.2 


6.2 


5.8 


AD 6 Hippo 


54.3 


88.3 


70.2 


AD 4 

Occipital Ctx 


16.3 


18.0 


7.0 


Control 2 
Hippo 


17.9 


18.8 


19.5 


AD 5 

Occipital Ctx 


77.9 


29.9 


26.2 


Control 4 
Hippo 


13.0 


193 


13.3 


AD 6 

Occipital Ctx 


36.9 


18.9 


18.8 


Control 
(Path) 3 
Hippo 


11.0 


7.5 


16.3 


Control 1 
Occipital Ctx 


6.2 


6.8 


5.4 


AD 1 

Temporal 

Ctx 


20.3 


14.6 


11.0 


Control 2 
Occipital Ctx 


54.0 


44.8 


51.4 


AD 2 

Temporal 

Ctx 


29.9 


16.6 


21.8 


Control 3 
Occipital Ctx 


32.3 


4.9 


26.8 
AD 3 
Temporal 
Ctx 


11.7 


[illegible] 


17.7 


Control 4 
Occipital Ctx 


7.5 


[illegible] 


5.3 


AD 4 

Temporal 

Ctx 


102 


5.6 


19.6 


Control 
(Path) 1 
Occipital Ctx 


50.3 


24.7 


41.8 


AD 5 Inf 
Temporal 
Ctx 


72.2 


47.0 


46.3 


Control 
(Path) 2 
Occipital Ctx 


12.8 


9.2 


6.3 


AD 5 Sup 
Temporal 
Ctx 


39.5 


51.1 


44.1 


Control 
(Path) 3 
Occipital Ctx 


5.5 


0.9 


0.0 


AD 6 Inf 
Temporal 
Ctx 


75.3 


84.1 


84.1 


Control 
(Path) 4 
Occipital Ctx 


16.6 


15.5 


12.3 


AD 6 Sup 
Temporal 
Ctx 


100.0 


100.0 


100.0 


Control 1 
Parietal Ctx 


10.0 


10.0 


3.6 


Control 1 

Temporal 
Ctx 


11.2 


10.4 


3.9 


Control 2 
Parietal Ctx 


46.0 


57.0 


27.5 


Control 2 

Temporal 
Ctx 


25.3 


21.6 


36.3 


Control 3 
Parietal Ctx 


23.5 


18.3 


16.6 


Control 3 
Temporal 
Ctx 


31.2 


37.9 


38.2 


Control 
(Path) 1 
Parietal Ctx 


78.5 


39.2 


52.5 


Control 3 
Temporal 
Ctx 


11.7 


8.4 


8.8 


Control 
(Path) 2 
Parietal Ctx 


23.5 


12.5 


14.9 


Control 
(Path) 1 
Temporal 
Ctx 


36.6 


53.6 


46.7 


Control 
(Path) 3 
Parietal Ctx 


9.5 


13.9 


5.8 


Control 
(Path) 2 
Temporal 
Ctx 


46.0 


29.7 


32.5 


Control 
(Path) 4 
Parietal Ctx 


46.0 


58.6 


39.2 



Table ATD. General_screening_panel_v1.5 



Tissue Name 


Rel. 

Exp.(%) 
Ag5110, 
Run 

228980585 


Tissue Name 


Rel. 

Exp.(%) 
Ag5110, 
Run 

228980585 


Adipose 


5.4 


Renal ca. TK-10 


11.7 


Melanoma* Hs688(A).T 


10.7 


Bladder 


12.2 
Melanoma* Hs688(B).T 


[illegible] 


Gastric ca. (liver met.) NCI-N87 




Melanoma* M14 


19.5 


Gastric ca. KATO III 


10.6 


Melanoma* LOXIMVI 


17.3 


Colon ca. SW-948 


2.6 


Melanoma* SK-MEL-5 


29.9 


Colon ca. SW480 


16.6 


Squamous cell carcinoma SCC-4 


4.2 


Colon ca.* (SW480 met) SW620 


10.8 


Testis Pool 


9.2 


Colon ca. HT29 


17.0 


Prostate ca.* (bone met) PC-3 


48.0 


Colon ca. HCT-116 


6.7 


Prostate Pool 


0.6 


Colon ca. CaCo-2 


9.8 


Placenta 


0.5 


Colon cancer tissue 


7.1 


Uterus Pool 


23 


Colon ca. SW1116 


2.5 


Ovarian ca. OVCAR-3 


5.5 


Colon ca. Colo-205 


3.5 


Ovarian ca. SK-OV-3 


11.8 


Colon ca. SW-48 


4.7 


Ovarian ca. OVCAR-4 


7.9 


Colon Pool 


0.8 


Ovarian ca. OVCAR-5 


17.4 


Small Intestine Pool 


1.2 


Ovarian ca. IGROV-1 


8.7 


Stomach Pool 


2.2 


Ovarian ca. OVCAR-8 


8.2 


Bone Marrow Pool 


1.2 


Ovary 


0.3 


Fetal Heart 


13.0 


Breast ca. MCF-7 


4.3 


Heart Pool 


4.0 


Breast ca. MDA-MB-231 


25.0 


Lymph Node Pool 


0.9 


Breast ca. BT 549 


21.3 


Fetal Skeletal Muscle 


0.6 


Breast ca. T47D 


2.7 


Skeletal Muscle Pool 


1.7 


Breast ca. MDA-N 


17.2 


Spleen Pool 


7.5 


Breast Pool 


0.7 


Thymus Pool 


11.6 


Trachea 


21.9 


CNS cancer (glio/astro) U87-MG 


48.3 


Lung 


1.2 


CNS cancer (glio/astro) U-118-MG 


71.7 


Fetal Lung 


4.0 


CNS cancer (neuro;met) SK-N-AS 


7.2 


Lung ca. NCI-N417 


11.3 


CNS cancer (astro) SF-539 


16.6 


Lung ca. LX-1 


20.3 


CNS cancer (astro) SNB-75 


24.7 


Lung ca. NCI-H146 


5.5 


CNS cancer (glio) SNB-19 


11.0 


Lung ca. SHP-77 


17.7 


CNS cancer (glio) SF-295 


27.5 


Lung ca. A549 


6.9 


Brain (Amygdala) Pool 


2.0 


Lung ca. NCI-H526 


11.9 


Brain (cerebellum) 


5.2 


Lung ca. NCI-H23 


4.7 


Brain (fetal) 


1.0 


Lung ca. NCI-H460 


32.3 


Brain (Hippocampus) Pool 


2.0 


Lung ca. HOP-62 


9.7 


Cerebral Cortex Pool 


1.9 


Lung ca. NCI-H522 


12.8 


Brain (Substantia nigra) Pool 


1.6 


Liver 


0.4 


Brain (Thalamus) Pool 


1.7 


Fetal Liver 


100.0 


Brain (whole) 


10 


Liver ca. HepG2 


15.4 


Spinal Cord Pool 


1.0 


Kidney Pool 


1.6 


Adrenal Gland 


14.9 


Fetal Kidney 


2.2 


Pituitary gland Pool 


0.4 


Renal ca. 786-0 


10.5 


Salivary Gland 


6.1 


Renal ca. A498 


0.2 


Thyroid (female) 


0.5 


Renal ca. ACHN 


8.4 


Pancreatic ca. CAPAN2 


2.6 
Renal ca. UO-31 


13.7 


Pancreas Pool 


[illegible] 



Table ATE. General_screening_panel_v1.6 



Tissue Name 


Rel. 
Exp.(%) 
Ag1778, 
Run 
277218713 


Rel. 
Exp.(%) 
Ag5110, 
Run 
277218715 


Tissue Name 


Rel. 
Exp.(%) 
Ag1778, 
Run 
277218713 


Rel. 
Exp.(%) 
Ag5110, 
Run 
277218715 


Adipose 


8.8 


8.7 


Renal ca. TK-10 


31.6 


13.5 


Melanoma* 
Hs688(A).T 


45.1 


15.5 


Bladder 


23.3 


14.5 


Melanoma* 
Hs688(B).T 


34.6 


11.7 


Gastric ca. (liver 
met.) NCI-N87 


22.1 


5.0 


Melanoma* M14 


29.3 


11.6 


Gastric ca. KATO 
III 


9.0 


15.3 


Melanoma* 
LOXIMVI 


16.6 


32.1 


Colon ca. SW-948 


9.2 


4.4 


Melanoma* 
SK-MEL-5 


23.0 


36.9 


Colon ca. SW480 






Squamous cell 
carcinoma SCC-4 


16.6 


7.2 


Colon ca.* (SW480 
met) SW620 


24.0 


11.9 


Testis Pool 


8.9 


8.5 


Colon ca. HT29 


32.1 


21.5 


Prostate ca.* (bone 
met) PC-3 


100.0 


50.7 


Colon ca. HCT-116 


17.9 


9.3 


Prostate Pool 


5.7 


1.7 


Colon ca. CaCo-2 


21.6 


13.5 


Placenta 


1.6 


0.3 


Colon cancer tissue 


3.2 


10.5 


Uterus Pool 


3.5 


3.1 


Colon ca. SW1116 


3.8 


2.7 


Ovarian ca. 
OVCAR-3 


11.6 


9.5 


Colon ca. Colo-205 


6.7 


4.5 


Ovarian ca. SK-OV-3 


33.0 


20.3 


Colon ca. SW-48 


12.1 


5.2 


Ovarian ca. 
OVCAR-4 


11.4 


10.7 


Colon Pool 


6.6 


1.8 


Ovarian ca. 
OVCAR-5 


28.1 


24.8 


Small Intestine Pool 


9.0 


3.0 


Ovarian ca. 
IGROV-1 


29.1 


12.7 


Stomach Pool 


5.6 


4.5 


Ovarian ca. 
OVCAR-8 


15.9 


0.1 


Bone Marrow Pool 


5.1 


2.4 


Ovary 


4.4 


1.6 


Fetal Heart 


61.6 


26.4 


Breast ca. MCF-7 


5.9 


3.6 


Heart Pool 


6.8 


8.8 


Breast ca. 
MDA-MB-231 


79.0 


34.4 


Lymph Node Pool 


10.4 


0.8 


Breast ca. BT 549 


35.6 


15.9 


Fetal Skeletal 
Muscle 


6.6 


0.6 
Breast ca. T47D 


3.0 


3.4 


Skeletal Muscle 
Pool 


0.9 


0.7 


Breast ca. MDA-N 


20.7 


20.9 


Spleen Pool 


19.2 


13.0 


Breast Pool 


7.4 


1.9 


Thymus Pool 


20.2 


12.6 


Trachea 


23.8 


33.7 


CNS cancer 

(glio/astro) 

U87-MG 


47.0 


51.1 


Lung 


4.6 


1.0 


CNS cancer 
(glio/astro) 
U-118-MG 


43.2 


100.0 


Fetal Lung 


17.4 


8.1 


CNS cancer 
(neuro;met) 
SK-N-AS 


14.1 


7.7 


Lung ca. NCI-N417 


16.2 


16.0 


CNS cancer (astro) 
SF-539 


35.1 


28.3 


Lung ca. LX-1 


38.7 


8.8 


CNS cancer (astro) 
SNB-75 


50.3 


30.8 


Lung ca. NCI-H146 


16.7 


5.9 


CNS cancer (glio) 
SNB-19 


34.4 


13.1 


Lung ca. SHP-77 


53.2 


25.9 


CNS cancer (glio) 
SF-295 


93.3 


46.0 


Lung ca. A549 


10.9 


9.9 


Brain (Amygdala) 
Pool 


7.7 


2.3 


Lung ca. NCI-H526 


10.1 


10.9 


Brain (cerebellum) 


24.7 


5.3 


Lung ca. NCI-H23 


12.2 


9.2 


Brain (fetal) 


9.7 


1.3 


Lung ca. NCI-H460 


57.4 


57.8 


Brain 

(Hippocampus) 
Pool 


9.7 


2.8 


Lung ca. HOP-62 


[illegible] 


9.7 


Cerebral Cortex 
Pool 


9.6 


3.3 


Lung ca. NCI-H522 


19.5 


13.3 


Brain (Substantia 
nigra) Pool 


6.0 


2.8 


Liver 


1.5 


0.6 


Brain (Thalamus) 
Pool 


15.3 


1.9 


Fetal Liver 


15.1 


6.0 


Brain (whole) 


9.5 


3.3 


Liver ca. HepG2 


[illegible] 


[illegible] 


Spinal Cord Pool 


[illegible] 


2.1 


Kidney Pool 


9.6 


2.0 


Adrenal Gland 


27.5 


23.3 


Fetal Kidney 


14.7 


2.6 


Pituitary gland Pool 


2.5 


1.0 


Renal ca. 786-0 


14.5 


11.0 


Salivary Gland 


9.8 


10.4 


Renal ca. A498 


2.2 


0.9 


Thyroid (female) 


1.5 


1.9 


Renal ca. ACHN 


9.5 


10.8 


Pancreatic ca. 
CAPAN2 


9.7 


5.3 


Renal ca. UO-31 


13.4 


4.6 


Pancreas Pool 


18.0 


7.2 
Table ATF. Panel 1.3D 



Tissue Name 


Rel. 
Exp.(%) 
Ag1778, 
Run 
157790405 


Tissue Name 


Rel. 
Exp.(%) 
Ag1778, 
Run 
157790405 


Liver adenocarcinoma 


6.7 


Kidney (fetal) 


12.1 


Pancreas 


1.3 


Renal ca. 786-0 


6.8 


Pancreatic ca. CAPAN 2 


2.1 


Renal ca. A498 


12.2 


Adrenal gland 


18.7 


Renal ca. RXF 393 


15.0 


Thyroid 


2.9 


Renal ca. ACHN 


3.2 


Salivary gland 


6.2 


Renal ca. UO-31 


8.4 


Pituitary gland 


5.7 


Renal ca. TK-10 


3.6 


Brain (fetal) 


2.5 


Liver 


3.0 


Brain (whole) 


4.8 


Liver (fetal) 


14.7 


Brain (amygdala) 


6.3 


Liver ca. (hepatoblast) HepG2 


25.5 


Brain (cerebellum) 


5.4 


Lung 


13.7 


Brain (hippocampus) 


22.8 


Lung (fetal) 


5.3 


Brain (substantia nigra) 


1.1 


Lung ca. (small cell) LX-1 


14.5 


Brain (thalamus) 


3.3 


Lung ca. (small cell) NCI-H69 


4.9 


Cerebral Cortex 


14.7 


Lung ca. (s.cell var.) SHP-77 


36.1 


Spinal cord 


2.3 


Lung ca. (large cell) NCI-H460 


12.9 


glio/astro U87-MG 


21.6 


Lung ca. (non-sm. cell) A549 


8.1 


glio/astro U-118-MG 


56.3 


Lung ca. (non-s.cell) NCI-H23 


7.3 


astrocytoma SW1783 


31.2 


Lung ca. (non-s.cell) HOP-62 


12.8 


neuro*; met SK-N-AS 


30.4 


Lung ca. (non-s.cl) NCI-H522 


4.5 


astrocytoma SF-539 


22.2 


Lung ca. (squam.) SW 900 


1.5 


astrocytoma SNB-75 


12.6 


Lung ca. (squam.) NCI-H596 


0.7 


glioma SNB-19 


29.9 


Mammary gland 


9.7 


glioma U251 


22.2 


Breast ca.* (pl.ef) MCF-7 


4.6 


glioma SF-295 


20.3 


Breast ca.* (pl.ef) MDA-MB-231 


100.0 


Heart (fetal) 


35.4 


Breast ca.* (pl.ef) T47D 


5.1 


Heart 


4.5 


Breast ca. BT-549 


45.1 


Skeletal muscle (fetal) 


26.1 


Breast ca. MDA-N 


28.9 


Skeletal muscle 


3.1 


Ovary 


4.0 


Bone marrow 


13.1 


Ovarian ca. OVCAR-3 


4.5 


Thymus 


6.2 


Ovarian ca. OVCAR-4 


3.5 


Spleen 


15.5 


Ovarian ca. OVCAR-5 


13.4 


Lymph node 


16.3 


Ovarian ca. OVCAR-8 


3.1 


Colorectal 


7.9 


Ovarian ca. IGROV-1 


4.2 


Stomach 


14.5 


Ovarian ca.* (ascites) SK-OV-3 


13.2 


Small intestine 


15.5 


Uterus 


3.1 


Colon ca. SW480 


9.7 


Placenta 


4.3 
Colon ca.* SW620 (SW480 met) 


10.7 


Prostate 


[illegible] 


Colon ca. HT29 


25.5 


Prostate ca.* (bone met) PC-3 


16.7 


Colon ca. HCT-116 


5.1 


Testis 


20.2 


Colon ca. CaCo-2 


8.1 


Melanoma Hs688(A).T 


7.1 


Colon ca. tissue(OD03866) 


8.4 


Melanoma* (met) Hs688(B).T 


3.8 


Colon ca. HCC-2998 


12.2 


Melanoma UACC-62 


2.0 


Gastric ca.* (liver met) NCI-N87 


11.1 


Melanoma M14 


11.4 


Bladder 


8.0 


Melanoma LOXIMVI 


10.8 


Trachea 


17.7 


Melanoma* (met) SK-MEL-5 


5.2 


Kidney 


0.7 


Adipose 


4.9 



Table ATG. Panel 4.1D 



Tissue Name 


Rel. 
Exp.(%) 
Ag1778, 
Run 
276596860 


Rel. 
Exp.(%) 
Ag1778, 
Run 
276686878 


Rel. 
Exp.(%) 
Ag5110, 
Run 
226444095 


Rel. 
Exp.(%) 
Ag5110, 
Run 
276596862 


Rel. 
Exp.(%) 
Ag5110, 
Run 
276686880 


Secondary Th1 act 


23.5 


26.8 


13.9 


14.9 


9.0 


Secondary Th2 act 


28.7 


28.1 


11.4 


14.8 


17.9 


Secondary Tr1 act 


5.4 


8.4 


7.9 


1.9 


4.5 


Secondary Th1 rest 


2.9 


3.8 


6.3 


1.0 


1.5 


Secondary Th2 rest 


7.4 


4.3 


11.3 


4.3 


2.7 


Secondary Tr1 rest 


4.3 


4.9 


6.6 


4.8 


1.4 


Primary Th1 act 


4.5 


5.6 


13.9 


5.0 


1.8 


Primary Th2 act 


23.2 


16.8 


14.4 


14.4 


16.5 


Primary Tr1 act 


22.2 


23.3 


13.9 


11.1 


12.3 


Primary Th1 rest 


3.1 


3.3 


2.2 


0.0 


0.0 


Primary Th2 rest 


6.8 


4.2 


5.6 


0.0 


0.0 


Primary Tr1 rest 


2.6 


3.6 


10.3 


0.7 


0.0 


CD45RA CD4 
lymphocyte act 


25.5 


26.4 


9.5 


18.3 


16.2 


CD45RO CD4 
lymphocyte act 


40.1 


27.2 


22.1 


27.9 


22.4 


CD8 lymphocyte act 


5.1 


7.4 


13.1 


8.1 


2.4 


Secondary CD8 
lymphocyte rest 


3.3 


5.1 


20.9 


32.3 


5.1 


Secondary CD8 
lymphocyte act 


4.3 


3.7 


3.3 


1.3 


0.0 


CD4 lymphocyte none 


13.3 


8.6 


13.7 


4.3 


4.9 


2ry 

Th1/Th2/Tr1_anti-CD95 
CH11 


3.2 


5.2 


8.1 


3.1 


2.4 


LAK cells rest 


13.2 


6.7 


10.1 


5.6 


4.6 
LAK cells IL-2 


[illegible] 


[illegible] 


[illegible] 


[illegible] 


[illegible] 


LAK cells IL-2+IL-12 


[illegible] 


[illegible] 


[illegible] 


[illegible] 


[illegible] 


LAK cells IL-2+IFN 
gamma 


[illegible] 


3.5 


12.2 


[illegible] 


7.6 


LAK cells IL-2+ IL-18 


5.4 


5.1 


15.6 


3.7 


12.2 


LAK cells 
PMA/ionomycin 


100.0


100.0


100.0 


100.0 


100.0 


NK Cells IL-2 rest


27.5 


17.8 


8.7 


7.1 


14.7 


Two Way MLR 3 day 


16.8 


21.2 


16.3 


5.1 


10.7 


Two Way MLR 5 day 


2.9 


2.7 


4.2 


1.7 


0.0 


Two Way MLR 7 day 


6.2 


2.6 


3.4 


1.9 


2.6 


PBMC rest 


3.6 


3.7 


5.9 


2.3 


3.2 


PBMC PWM


9.5 


6.9


4.5 


1.7 


1.6 


PBMC PHA-L 


6.9 


8.0 


8.7 


5.0 


3.4 


Ramos (B cell) none 


7.7 


4.2 


A 1 
4./ 


u.o 


1.4


Ramos (B cell) ionomycin 


36.6 


32.1 


n.y 






B lymphocytes PWM 


11.7 


4.9 


0.7 


4.4 


4.J 


B lymphocytes CD40L 
and IL-4 




21.0 


13.2 


15.2 


19.8 


EOL-1 dbcAMP 


52.1 


34.4


11.0 


10.8 


15.6 


EOL-1 dbcAMP 
PMA/ionomycin 


9.8 


6.0 


3.5 


1.4 


5.8 


Dendritic cells none 


9.5 


7.7 


7.3 


6.3 


5.4 


Dendritic cells LPS 


5.6 


5.0 


6.6 


1.1 


2.0 


Dendritic cells anti-CD40 


3.6 


4.2 


7.0 


1.3 


1.5 


Monocytes rest 


4.9 


3.1 


6.9 


1.2 


0.0 


Monocytes LPS 


11.3 


8.4 


6.8 


2.9 


0.0 


Macrophages rest 


5.7 


10.2


5.7 


1.9 


0.0 


Macrophages LPS 


3.2 


3.0 


5.2 


0.7 


3.6 


HUVEC none 


6.0 


4.2 


1.8 


1.3 


5.2 


HUVEC starved 


11.0 


9.5 


4.4 


5.9 


2.3 


HUVEC IL-1beta


11.9 


1A 1 

1U.1 


4,y 


Q 1 
O.l 






9.2 


9.4 


5.5 


2.7 


6.5 


HUVEC TNF alpha + IFN gamma


3.8 


3.6 


4.1 


3.5 


1.8 


HUVEC TNF alpha + IL4


2.7 


2.8 


5.5 


0.0 


0.0 


HUVEC IL-11 


4.3 


5.3 


3.5 


3.4 


0.0 


Lung Microvascular EC 
none 


25.3 


23.3 


7.5 


6.9 


6.2 


Lung Microvascular EC 
TNFalpha + IL-1beta


9.2 


7.0 


7.9 


2.6 


2.2 


Microvascular Dermal EC 
none 


1.8 


2.1 


3.8 


0.0 


0.0 


Microvascular Dermal EC
TNFalpha + IL-1beta


2.0 


2.6 


1.9 


1.3 


0.0 
Bronchial epithelium
TNFalpha + IL-1beta


ft ft 

o.o 


id ft 


1 PCTl 

1ft 


f US OB. i 
j.j 


D.D 


Small airway epithelium 
none 


10.7 


3.0 


2.4 


3.4 


6.0 


Small airway epithelium 
TNFalpha + IL-1beta


31.9 


31.0 


21.9 


30.4 


15.8 


Coronary artery SMC rest


25.2 


19.6 


9.1 


13.3 


13.4 


Coronary artery SMC
TNFalpha + IL-1beta


27.5 


19.6 


5.5 


7.8 


15.2 


Astrocytes rest 


8.2 


15.3 


2.4 


1.9 


2.8 


Astrocytes TNFalpha +
IL-1beta


*z 7 


7 7 
L.J 


1 A 
DM 


ft ft 

U.v/ 


^ i 
j.j 


KU-812 (Basophil) rest 


10.7 


8.1 


3.5 


2.0 


0.0 


KU-812 (Basophil) 
PMA/ionomycin 


37.1 


25.5 


11.6 


8.9 


5.2 


CCD1106 (Keratinocytes)
none 


20.6 


20.9 


13.2 


4.5 


6.9 


CCD1106 (Keratinocytes)
TNFalpha + IL-1beta


14.1 


22.7 


17.8 


7.7 


2.3 


Liver cirrhosis 


11.4 


8.5 


7.4 


1.4 


1.4 


NCI-H292 none 


12.9 


7.6 


7.1 


5.5 


7.5


NCI-H292 IL-4


11.9


12.2


4.3


4.8


5.5


NCI-H292 IL-9 


16.8 


12.7 


7.0 


3.7 


11.4 


NCI-H292 IL-13


12.5 


10.0 


6.5 


4.2 


7.3 


NCI-H292 IFN gamma


3.9 


4.1 


7.6 


2.6 


4.2 


HPAEC none 


1.7 


2.9 


2.6 


0.0 


0.0 


HPAEC TNF alpha + IL-1
beta 


10.6 


7.2 


2.9 


2.7 


3.3 


Lung fibroblast none 


31.2 


24.1 


4.5 


8.7 


5.8 


Lung fibroblast TNF
alpha + IL-1 beta


24.3 


21.6 


6.6 


7.5 


11.2 


Lung fibroblast IL-4 


6.5 


1.1 


1.8 


3.2 


4.0 


Lung fibroblast IL-9 


19.2 


28.3 


8.2 


6.7 


7.7 


Lung fibroblast IL-13 


8.2 


5.1 


2.9 


0.0 


3.6 


Lung fibroblast IFN 
gamma 


15.3 


14.9 


5.5 


3.8 


12.9 


Dermal fibroblast 
CCD1070 rest


i* ft 


73 % 


7 fl 


A (L 


iin 


Dermal fibroblast 
CCD1070 TNF alpha


74.2 


45.1 


14.1 


23.2 


36.3 


Dermal fibroblast 
CCD1070 IL-1 beta 


23.3 


22.4 


4.3 


3.9 


5.7 


Dermal fibroblast IFN 
gamma 


3.4 


3.9 


2.0 


0.9 


0.0 


Dermal fibroblast IL-4


6.8 


8.2 


3.3 


2.6 


3.0 


Dermal Fibroblasts rest 


11.2 


7.8 


2.8 


2.7 


3.8 


Neutrophils TNFa+LPS 


4.5 


1.6 


1.6 


1.8 


0.0 
Neutrophils rest


28.9 


31.2 


12.1






Colon 


2.3 


1.5 


2.3 


0.0 


2.3 


Lung 


2.0 


2.4 


3.7 


0.9 


1.6 


Thymus 


13.0 


14.6 


6.6 


0.0 


5.1


Kidney 


7.9 


7.5 


1.7 


1.1 


2.8 



Table ATH. general oncology screening panel v 2.4 



Tissue Name


Rel.

Exp.(%)
Ag5110,
Run

259939210


Tissue Name


Rel.

Exp.(%)
Ag5110,
Run

259939210


Colon cancer 1 


6.5 


Bladder NAT 2 


0.0 


Colon NAT 1 


5.9 


Bladder NAT 3 


0.0 


Colon cancer 2 


6.0 


Bladder NAT 4 


0.0 


Colon NAT 2 


14.2 


Prostate adenocarcinoma 1 


1.2 


Colon cancer 3 


23.7 


Prostate adenocarcinoma 2 


0.0 


Colon NAT 3 


15.7 


Prostate adenocarcinoma 3 


1.6 


Colon malignant cancer 4 


41.5 


Prostate adenocarcinoma 4 


14.2 


Colon NAT 4 


4.2 


Prostate NAT 5 


0.9 


Lung cancer 1 


7.5 


Prostate adenocarcinoma 6 


0.0 


Lung NAT 1


0.0 


Prostate adenocarcinoma 7 


0.7 


Lung cancer 2 


28.5 


Prostate adenocarcinoma 8 


0.0 


Lung NAT 2 


1.2 


Prostate adenocarcinoma 9 


3.0 


Squamous cell carcinoma 3 


42.3 


Prostate NAT 10 


0.0 


Lung NAT 3 


0.0 


Kidney cancer 1 


34.2 


Metastatic melanoma 1 


1.4 


Kidney NAT 1 


4.5 


Melanoma 2 


10.4 


Kidney cancer 2 


100.0 


Melanoma 3 


2.1 


Kidney NAT 2 


3.2 


Metastatic melanoma 4 


2.2 


Kidney cancer 3 


19.6 


Metastatic melanoma 5 


4.5 


Kidney NAT 3 


1.1 


Bladder cancer 1 


0.0 


Kidney cancer 4 


37.1 


Bladder NAT 1 


0.0 


Kidney NAT 4 


1.0 


Bladder cancer 2 


2.3 







CNS_neurodegeneration_v1.0 Summary: Ag1778/Ag5110 This panel confirms
the expression of this gene at low levels in the brains of an independent group of
individuals. However, no differential expression of this gene was detected between
Alzheimer's diseased postmortem brains and those of non-demented controls in this
experiment. Please see Panel 1.5 for a discussion of this gene in
nervous system disorders.

General_screening_panel_v1.5 Summary: Ag5110 Highest expression of this
gene is detected in fetal liver (CT=29.4). Interestingly, this gene is expressed at much
higher levels in fetal liver than in adult liver (CT=37). This observation suggests that
expression of this gene can be used to distinguish fetal from adult liver. In addition, the
relative overexpression of this gene in fetal tissue suggests that the protein product may
enhance liver growth or development in the fetus and thus may also act in a regenerative
capacity in the adult. Therefore, therapeutic modulation of the protein encoded by this gene
could be useful in treatment of liver related diseases.
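The CT comparison above can be made concrete with a small calculation. The following is only a sketch: it assumes roughly 100% PCR efficiency, so that one cycle of CT difference corresponds to a two-fold difference in template, and it assumes that relative expression is scaled to the sample with the lowest CT. Neither assumption is stated in this section, and the function names are hypothetical.

```python
def fold_difference(ct_a, ct_b, amplification_factor=2.0):
    """Approximate fold excess of sample A over sample B.

    Assumes each PCR cycle multiplies the template by `amplification_factor`
    (2.0 corresponds to an idealized 100% efficient reaction).
    """
    return amplification_factor ** (ct_b - ct_a)


def relative_expression_percent(ct_by_sample, amplification_factor=2.0):
    """Scale every sample so the lowest-CT (highest-expressing) sample is 100%."""
    ct_min = min(ct_by_sample.values())
    return {
        sample: 100.0 * amplification_factor ** (ct_min - ct)
        for sample, ct in ct_by_sample.items()
    }


# CT values quoted in the panel 1.5 summary for probe set Ag5110.
liver_ct = {"fetal liver": 29.4, "adult liver": 37.0}

fold = fold_difference(liver_ct["fetal liver"], liver_ct["adult liver"])
percent = relative_expression_percent(liver_ct)

print(f"fetal vs adult liver: about {fold:.0f}-fold")
for sample, value in percent.items():
    print(f"{sample}: {value:.2f}% of maximum")
```

Under these assumptions a difference of 7.6 cycles corresponds to a difference of roughly two orders of magnitude, which is consistent with the description of the fetal signal as much higher than the adult signal.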

Among tissues with metabolic or endocrine function, this gene is expressed at low
levels in adipose, adrenal gland, heart, fetal liver and stomach. This gene codes for a splice
variant of pyruvate dehydrogenase [lipoamide] kinase (PDK). PDK catalyzes
phosphorylation and inactivation of the pyruvate dehydrogenase
complex (PDC). Inactivation of PDC by increased PDK activity promotes gluconeogenesis
by conserving three-carbon substrates. This helps maintain glucose levels during starvation,
but is detrimental in diabetes (Huang et al., 2002, Diabetes 51(2):276-83, PMID:
11812733). Therefore, therapeutic modulation of the activity of the PDK encoded by this gene may
be useful in the treatment of endocrine/metabolically related diseases, such as obesity and
diabetes.

In addition, this gene is expressed at low levels in cerebellum and whole brain. 
Therefore, therapeutic modulation of this gene product may be useful in the treatment of 
neurological disorders such as Alzheimer's disease, Parkinson's disease, epilepsy, multiple 
sclerosis, schizophrenia and depression. 

Moderate to low levels of expression of this gene are also seen in a cluster of cancer
cell lines derived from pancreatic, gastric, colon, lung, liver, renal, breast, ovarian, prostate,
squamous cell carcinoma, melanoma and brain cancers. Thus, expression of this gene could
be used as a marker to detect the presence of these cancers. Furthermore, therapeutic
modulation of the expression or function of this gene may be effective in the treatment of
pancreatic, gastric, colon, lung, liver, renal, breast, ovarian, prostate, squamous cell
carcinoma, melanoma and brain cancers.

General_screening_panel_v1.6 Summary: Ag1778/Ag5110 Two experiments
with different probe and primer sets are in good agreement. Highest expression of this gene
is detected in a prostate cancer PC3 cell line and a brain cancer cell line
(CTs=25-29.8). Expression in this panel correlates with the pattern seen in panel 1.5. Moderate
to low levels of expression of this gene are detected in tissues with metabolic/endocrine
functions such as pancreas, adipose, adrenal gland, heart, fetal liver and gastrointestinal
tract, in brain including cerebellum, cerebral cortex, substantia nigra and the whole brain
and also in a number of cancer cell lines derived from pancreatic, gastric, colon, lung, liver,
renal, breast, ovarian, prostate, squamous cell carcinoma, melanoma and brain cancers.
Please see panel 1.5 for further discussion on the utility of this gene.

Panel 1.3D Summary: Ag1778 Highest expression of this gene is detected in a
breast cancer cell line (CT=27.4). Expression in this panel correlates with the pattern seen in
panel 1.5. Moderate to low levels of expression of this gene are detected in tissues with
metabolic/endocrine functions such as pancreas, adrenal gland, heart, fetal liver and
gastrointestinal tract, in brain including cerebellum, cerebral cortex, substantia nigra and
the whole brain and also in a number of cancer cell lines derived from pancreatic, gastric,
colon, lung, liver, renal, breast, ovarian, prostate, squamous cell carcinoma, melanoma and
brain cancers. Please see panel 1.5 for further discussion of this gene.

Panel 4.1D Summary: Ag1778/Ag5110 Five experiments with the two different
probe-primer sets are in good agreement. Highest expression of this gene is detected in
PMA/ionomycin treated LAK cells. These cells are involved in tumor immunology and
clearance of virally and bacterially infected cells as well as tumors. Therefore, modulation of
the function of the protein encoded by this gene through the application of a small molecule
drug or antibody may alter the functions of these cells and lead to improvement of
symptoms associated with these conditions.

Low levels of expression of this gene are also seen in naive and memory T cells,
resting secondary CD8 lymphocytes, cytokine activated small airway epithelium, and
resting neutrophils. Therefore, therapeutic modulation of this gene or its protein product
may ameliorate symptoms/conditions associated with autoimmune and inflammatory
disorders including psoriasis, allergy, asthma, inflammatory bowel disease, rheumatoid
arthritis and osteoarthritis.

general oncology screening panel_v_2.4 Summary: Ag5110 Highest expression
of this gene is detected in kidney cancer (CT=32). Low levels of expression of this gene are
also seen in colon, lung, prostate and kidney cancer. Higher levels of expression of this
gene are associated with cancer as compared to corresponding normal tissue. Therefore,
expression of this gene may be used as a diagnostic marker for the detection of these cancers.
Furthermore, therapeutic modulation of this gene or its protein product may be useful in the
treatment of colon, lung, prostate and kidney cancers.
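The tumor versus normal comparison summarized above can be illustrated with the kidney entries of Table ATH. This is a minimal sketch, not the analysis used in the document: it assumes that "Kidney cancer N" and "Kidney NAT N" are matched samples, and the helper name is hypothetical. The relative expression values are copied from Table ATH (Ag5110, Run 259939210).

```python
# (tumor, NAT) relative expression in percent, taken from Table ATH.
kidney_pairs = {
    "Kidney 1": (34.2, 4.5),
    "Kidney 2": (100.0, 3.2),
    "Kidney 3": (19.6, 1.1),
    "Kidney 4": (37.1, 1.0),
}


def tumor_to_nat_ratio(tumor, nat):
    """Return tumor/NAT expression ratio, or None when NAT shows no signal.

    A NAT value of 0.0 (for example Lung NAT 1) makes the ratio undefined,
    so it is reported as None rather than as an infinite fold change.
    """
    if nat == 0.0:
        return None
    return tumor / nat


ratios = {
    name: tumor_to_nat_ratio(tumor, nat)
    for name, (tumor, nat) in kidney_pairs.items()
}

for name, ratio in ratios.items():
    print(f"{name}: tumor/NAT = {ratio:.1f}")
```

In all four kidney pairs the tumor value exceeds the adjacent normal value, which is the pattern the summary describes. Pairs with an undetectable normal signal are better reported as "not detected in NAT" than as a ratio.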

AU. CG96736-01: Neutral amino acid transporter B. 

Expression of gene CG96736-01 was assessed using the primer-probe sets Ag3788
and Ag4075, described in Tables AUA and AUB. Results of the RTQ-PCR runs are shown
in Tables AUC, AUD, AUE, AUF, AUG, AUH, AUI, AUJ and AUK.

Table AUA. Probe Name Ag3788

Primers   Sequences                                    Length   Start Position   SEQ ID No
Forward   5'-cgagaaatatcttcccttccaa-3'                 22       1182             432
Probe     TET-5'-tgtcagcagcctttcgctcatactct-3'-TAMRA   26       1209             433
Reverse   5'-ttccggtgatattcctctcttc-3'                 22       1244             434


Table AUB. Probe Name Ag4075

Primers   Sequences                                    Length   Start Position   SEQ ID No
Forward   5'-cgagaaatatcttcccttccaa-3'                 22       1182             435
Probe     TET-5'-tgtcagcagcctttcgctcatactct-3'-TAMRA   26       1209             436
Reverse   5'-ttccggtgatattcctctcttc-3'                 22       1244             437


Table AUC. AI_comprehensive panel_v1.0



Tissue Name 


Rel. 

Exp.(%) 
Ag4075, 
Run 

226203371 


Tissue Name


Rel. 

Exp.(%) 
Ag4075, 
Run 

226203371 


110967 COPD-F 


6.0 


112427 Match Control Psoriasis-F 


12.3 


110980 COPD-F 


9.9 


112418 Psoriasis-M


3.6 


110968 COPD-M 


6.6 


112723 Match Control Psoriasis-M 


6.3 


110977 COPD-M 


0.0 


112419 Psoriasis-M 


6.5 
110989 Emphysema-F


0.7 


112424 Match Control Psoriasis-M


T ^«J< «JI« M.JI .it ,, 


110992 Emphysema-F 


12.3 


112420 Psoriasis-M


14.1 


110993 Emphysema-F 


7.2 


112425 Match Control Psoriasis-M 


6.7 


110994 Emphysema-F 


4.6 


104689 (MF) OA Bone-Backus 


21.6 


110995 Emphysema-F 


20.3 


104690 (MF) Adj "Normal" 
Bone-Backus 


21.8 


110996 Emphysema-F 


7.1 


104691 (MF) OA Synovium-Backus 


14.1 


110997 Asthma-M 


2.5 


104692 (BA) OA Cartilage-Backus 


53.6 


111001 Asthma-F 


6.7 


104694 (BA) OA Bone-Backus 


14.8 


111002 Asthma-F 


5.7 


104695 (BA) Adj "Normal" 
Bone-Backus 


28.7 


111003 Atopic Asthma-F 


11.0 


104696 (BA) OA Synovium-Backus 


15.8 


111004 Atopic Asthma-F


13.3 


104700 (SS) OA Bone-Backus 


11.6 


111005 Atopic Asthma-F 


12.2 


104701 (SS) Adj "Normal"
Bone-Backus


12.7 


1 1 1flfW\ AfrvnJr» Aethrni* T? 
1 1 lvA/O AlOpiC nSilUlld-r 


Z.O 


lut/ox vp^/ oynovium-JOaCKus 


97 s 


1 1 1 A 1*7 A llarmi A/T 

11141 / Allergy -M 


/.O 


1 1 /uio cartilage Kep / 


O.J 


1 izoh- / Allergy -M 


\).\J 


1 1 96*79 OA RnnP^ 

1 izo /z r>oneo 




1 izJ4y XNonnai iiUng-r 




1 1 zo / 0 oynoviurro 


t A 
1.4 


112357 Normal Lung-F 


19.9 


112674 OA Synovial Fluid cells5 


3.0 


112354 Normal Lung-M 


4.0 


117100 OA Cartilage Rep14


4.0 


112374 Crohns-F


2.7 


112756 OA Bone9 


100.0


112389 Match Control Crohns-F


9.3 


112757 OA Synovium9 


0.9 


112375 Crohns-F 


2.0 


112758 OA Synovial Fluid Cells9


3.8 


112732 Match Control Crohns-F 


12.6 


117125 RA Cartilage Rep2 


9.0 


112725 Crohns-M 


0.3 


113492 Bone2 RA 


8.1 


112387 Match Control 
Crohns-M 






9 ^ 


112378 Crohns-M 


0.0 


113494 Syn Fluid Cells RA


5.3 


112390 Match Control 
Crohns-M 


\3Af 




u. / 


112726 Crohns-M 


9.9 


113500 Bone4 RA 


7.0 


112731 Match Control 
Crohns-M 


K 1 


^vnoviiimd R A 
1 ijjui 0 j iiu viuiji*t xvrv 


44 


112380 Ulcer Col-F 


6.0 


113502 Syn Fluid Cells4 RA


3.2 


112734 Match Control Ulcer
Col-F 


91 0 


1 1 ^4QS Parti la ere'* T? A 


U.J 


112384 Ulcer Col-F 


14.1 


113496 Bone3 RA 


8.4 


112737 Match Control Ulcer
Col-F 


3.4 


113497 Synovium3 RA 


5.1 


112386 Ulcer Col-F 


3.4 


113498 Syn Fluid Cells3 RA


7.9 


112738 Match Control Ulcer 
Col-F 


18.0 


117106 Normal Cartilage Rep20 


8.7 


112381 Ulcer Col-M 


0.0 


113663 Bone3 Normal 


0.0 


112735 Match Control Ulcer 
Col-M 


0.5 


113664 Synovium3 Normal 


0.0 
112382 Ulcer Col-M


7.1 


113665 Syn Fluid Cells3 Normal


q jjjft JL . 


112394 Match Control Ulcer 
Col-M 


1.6 


117107 Normal Cartilage Rep22


1.8 


112383 Ulcer Col-M 


13.1 


113667 Bone4 Normal 


2.4 


112736 Match Control Ulcer 
Col-M 


3.8 


113668 Synovium4 Normal 


1.7 


112423 Psoriasis-F 


6.3 


113669 Syn Fluid Cells4 Normal 


3.9 



Table AUD. CNS_neurodegeneration_v1.0



Tissue Name 


Rel.

Exp.(%)
Ag4075, 
Run 

214294982 


Tissue Name


Rel.

Exp.(%)
Ag4075, 
Run 

214294982 


AD 1 Hippo 


11.0 


Control (Path) 3 Temporal Ctx 


1.0 


AD 2 Hippo 


8.4 


Control (Path) 4 Temporal Ctx 


1.7 


AD 3 Hippo 


8.0 


AD 1 Occipital Ctx 


6.5 


AD 4 Hippo 


2.9 


AD 2 Occipital Ctx (Missing) 


0.0 


AD 5 Hippo 


16.8 


AD 3 Occipital Ctx


i.j 


AD 6 Hippo 


100.0 


AD 4 Occipital Ctx 


3.6 


Control 2 Hippo 


19.6 


AD 5 Occipital Ctx 


11.9 


Control 4 Hippo 


17.6 


AD 6 Occipital Ctx 


6.5 


Control (Path) 3 Hippo 


3.0 


Control 1 Occipital Ctx 


5.6 


AD 1 Temporal Ctx 


6.3 


Control 2 Occipital Ctx 


10.4 


AD 2 Temporal Ctx 


14.1 


Control 3 Occipital Ctx 


6.0 


AD 3 Temporal Ctx 


4.2 


Control 4 Occipital Ctx 


2.9 


AD 4 Temporal Ctx 


7.5 


Control (Path) 1 Occipital Ctx 


3.3 


AD 5 M Temporal Ctx 


8.9 


Control (Path) 2 Occipital Ctx 


0.5 


AD 5 Sup Temporal Ctx 


24.5 


Control (Path) 3 Occipital Ctx 


1.6 


AD 6 Inf Temporal Ctx 


78.5 


Control (Path) 4 Occipital Ctx 


0.4 


AD 6 Sup Temporal Ctx 


56.6 


Control 1 Parietal Ctx 


5.9 


Control 1 Temporal Ctx 


2.3 


Control 2 Parietal Ctx 


9.9 


Control 2 Temporal Ctx 


12.1 


Control 3 Parietal Ctx 


6.0 


Control 3 Temporal Ctx 


7.7 


Control (Path) 1 Parietal Ctx 


3.6 


Control 3 Temporal Ctx 


3.1 


Control (Path) 2 Parietal Ctx 


1.1 


Control (Path) 1 Temporal Ctx 


4.6 


Control (Path) 3 Parietal Ctx 


2.2 


Control (Path) 2 Temporal Ctx 


1.8 


Control (Path) 4 Parietal Ctx 


3.4 
Table AUE. General_screening_panel_v1.4



Tissue Name 


Rel. 

Exp.(%) 
Ag4075, 
Run 

212696066 


Rel. 

Exp.(%) 
Ag4075, 
Run 

218525356 


Tissue Name


Rel. 

Exp.(%) 
Ag4075, 
Run 

212696066 


Rel. 

Exp.(%) 
Ag4075, 
Run 

218525356 


Adipose 


0.0


1.3 


Renal ca. TK-10 


9.7 


14.8 


Melanoma* 
Hs688(A).T 


14.4 


23.2 


Bladder 


1.0 


1.8 


Melanoma* 
Hs688(B).T 


19.1 


29.9 


Gastric ca. (liver 
met.) NCI-N87 


41.5 


42.0 


Melanoma* M14 


9.5 


12.7 


Gastric ca. KATO III




9? R 

LL.O 


Melanoma* 
LOXIMVI


8.1 


12.9 


Colon ca. SW-948 


4.4 


5.6 


Melanoma* 
SK-MEL-5 


5.9 


14.2 


Colon ca. SW480 


100.0 


100.0 


Squamous cell 
carcinoma SCC-4 


5.1 


10.2 


Colon ca.* (SW480
met) SW620 


41.5 


50.0 


Testis Pool 


1.4 


1.9 


Colon ca. HT29 


10.2 


13.6 


Prostate ca.* (bone 
met) PC-3


9.5 


13.6 


Colon ca. HCT-116 


13.0 


20.9 


Prostate Pool 


1.1 


1.5 


Colon ca. CaCo-2 


12.0 


14.5 


Placenta 


1.1 


1.3 


Colon cancer tissue 


5.0 


8.4 


Uterus Pool 


0.1 


0.2 


Colon ca. SW1116 


14.7 


15.9 


Ovarian ca. 
OVCAR-3 


6.5 


8.0 


Colon ca. Colo-205


74 7 




Ovarian ca. SK-OV-3 


8.1 


9.9 


Colon ca. SW-48 


3.6 


4.7 


Ovarian ca. 
OVCAR-4 


9.2 


16.4 


Colon Pool 


0.7 


i 1 

Am X 


Ovarian ca. 
OVCAR-5 


28.1 


32.1 


Small Intestine Pool 


0.5 


0.6


Ovarian ca. 
IGROV-1 


23.0 


33.2 


Stomach Pool 


0.8


0 ft 


Ovarian ca. 
OVCAR-8 


10.3 


16.4 


Bone Marrow Pool 


0.2 


0.4 


Ovary 


0.5 


0.8 


Fetal Heart 


0.1 


0.1 


Breast ca. MCF-7 


15.7 


17.2 


Heart Pool 


0.2 


0.3 


Breast ca. 
MDA-MB-231 


10.4 


15.6 


Lymph Node Pool 


1.2 


1.0 


Breast ca. BT 549 


9.9 


18.7 


Fetal Skeletal 
Muscle 


0.2 


0.2 


Breast ca. T47D


53.2 


51.8 


Skeletal Muscle 
Pool 


0.2 


0.3 


Breast ca. MDA-N 


4.7 


6.3 


Spleen Pool 


0.7 


0.5 


Breast Pool 


0.6 


0.6 


Thymus Pool 


0.8 


0.9 
Trachea


3.6 


5.3 


CNS cancer
(glio/astro)
U87-MG


20.0


20.3


Lung 


0.1 


0.1 


CNS cancer 

(glio/astro) 

U-118-MG 


11.2 


12.9 


Fetal Lung 


2.4 


4.0 


CNS cancer 
(neuro;met) 
SK-N-AS 


6.9 


8.9 


Lung ca. NCI-N417 


1.6 


0.0 


CNS cancer (astro) 
SF-539 


9.3 


12.0 


Lung ca. LX-1


R1 8 

OJ.O 




CNS cancer (astro) 
SNB-75 


K 1 


55.5 


Lung ca. NCI-H146


0.4 


0.8 


CNS cancer (glio) 
SNB-19 


30.1 


37.6 


Lung ca. SHP-77


O.o 


5.3 


CNS cancer (glio) 
SF-295 


58.0


60.7 


Lung ca. A549


O ft 


J.J.O 


Brain (Amygdala) 
Pool 


0.0 


0.1 


Lung ca. NCI-H526


2.1 


2.5 


Brain (cerebellum) 


0.1 


0.2 


Lung ca. NCI-H23


4.3 


4.2 


Brain (fetal) 


0.2 


0.3 


Lung ca. NCI-H460


9.2 


16.2 


Brain 

(Hippocampus) 
Pool 


0.1 


0.1 


Lung ca. HOP-62 


4.4 


4.5 


Cerebral Cortex 
Pool 


0.0 


0.1


Lung ca. NCI-H522


9.5 


10.0 


Brain (Substantia 
nigra) Pool 


0.1 


0.1 


Liver 


0.0 


0.1 


Brain (Thalamus) 
Pool 


0.0 


0.1 


Fetal Liver


2.9


4.3


Brain (whole) 


0.2 


0.2 


Liver ca. HepG2 


6.7 


7.9 


Spinal Cord Pool


u2 

v/.*» 


U.J 


Kidney Pool 


1.1 


1.2 


Adrenal Gland 


0.3 


0.6 


Fetal Kidney 


0.3 


0.5 


Pituitary gland Pool 


0.1 


0.3 


Renal ca. 786-0 


5.1 


9.5 


Salivary Gland 


3.0 


2.8 


Renal ca. A498


3.1 


5.0 


Thyroid (female) 


0.1 


0.1


Renal ca. ACHN 


5.1 


5.9 


Pancreatic ca. 
CAPAN2 


7.9 


12.2 


Renal ca. UO-31 


2.6 


4.2 


Pancreas Pool 


1.3 


1.2 
Table AUF. General_screening_panel_v1.5



Tissue Name 


Rel. 

Exp.(%) 
Ag4075, 
Run 

228714883 


Tissue Name


Rel. 

Exp.(%) 
Ag4075,
Run 

228714883 


Adipose 


1.0 


Renal ca. TK-10 


9.8


Melanoma* Hs688(A).T 


18.0 


Bladder 


1.4 


Melanoma* Hs688(B).T 


17.4 


Gastric ca. (liver met.) NCI-N87 


35.4 


Melanoma* M14


9.5 


Gastric ca. KATO III


19.9 


Melanoma* LOXIMVI 


9.0 


Colon ca. SW-948 


4.4 


Melanoma* SK-MEL-5 


8.7 


Colon ca. SW480 


100.0 


Squamous cell carcinoma SCC-4 


5.8 


Colon ca.* (SW480 met) SW620


32.8 


Testis Pool 


1.2 


Colon ca. HT29 


9.9 


Prostate ca.* (bone met) PC-3 


10.8 


Colon ca. HCT-116


15.2 


Prostate Pool 


1.5 


Colon ca. CaCo-2 


11.1 


Placenta 


1.1 


Colon cancer tissue 


5.1 


Uterus Pool 


0.3 


Colon ca. SW1116 


7.2 


Ovarian ca. OVCAR-3 


6.2 


Colon ca. Colo-205 


23.7 


Ovarian ca. SK-OV-3 


7.5 


Colon ca. SW-48


3.2 


Ovarian ca. OVCAR-4 


12.5 


Colon Pool 


0.7 


Ovarian ca. OVCAR-5 


20.2 


Small Intestine Pool 


0.4 


Ovarian ca. IGROV-1 


23.8 


Stomach Pool 


0.7 


Ovarian ca. OVCAR-8 


11.2 


Bone Marrow Pool 


0.2 


Ovary 


0.6 


Fetal Heart 


0.1 


Breast ca. MCF-7 


14.4 


Heart Pool 


0.2 


Breast ca. MDA-MB-231 


14.1 


Lymph Node Pool 


0.7 


Breast ca. BT 549 


8.4 


Fetal Skeletal Muscle 


0.2 


Breast ca. T47D 


2.1 


Skeletal Muscle Pool 


0.4 


Breast ca. MDA-N 


3.6 


Spleen Pool 


0.3 


Breast Pool 


0.5 


Thymus Pool 


0.5 


Trachea 


4.6 


CNS cancer (glio/astro) U87-MG 


12.5 


Lung 


0.1 


CNS cancer (glio/astro) U-118-MG


8.5 


Fetal Lung 


2.6 


CNS cancer (neuro;met) SK-N-AS 


5.5 


Lung ca. NCI-N417 


1.9 


CNS cancer (astro) SF-539 


8.4 


Lung ca. LX-1 


81.8 


CNS cancer (astro) SNB-75 


13.1 


Lung ca. NCI-H146 


0.6 


CNS cancer (glio) SNB-19


27.2 


Lung ca. SHP-77 


7.7 


CNS cancer (glio) SF-295 


53.2 


Lung ca. A549 


11.8 


Brain (Amygdala) Pool 


0.0 


Lung ca. NCI-H526 


2.1 


Brain (cerebellum) 


0.1 


Lung ca. NCI-H23 


3.5 


Brain (fetal) 


0.2 


Lung ca. NCI-H460 


8.8 


Brain (Hippocampus) Pool 


0.0 


Lung ca. HOP-62 


3.5 


Cerebral Cortex Pool 


0.1 
Lung ca. NCI-H522


7 S 


Brain (Substantia nigra) Pool




Liver 


0.0


Brain (Thalamus) Pool




Fetal Liver 


2.9


Brain (whole) 


0.2 


Liver ca. HepG2 


6.2 


Spinal Cord Pool 


0.1 


Kidney Pool 


0.8 


Adrenal Gland 


0.4 


Fetal Kidney 


0.3 


Pituitary gland Pool 


0.2 


Renal ca. 786-0 


5.6 


Salivary Gland 


2.7 


Renal ca. A498 


3.4 


Thyroid (female) 


0.1 


Renal ca. ACHN 


4.9 


Pancreatic ca. CAPAN2 


9.7 


Renal ca. UO-31 


2.4 


Pancreas Pool 


0.8 



Table AUG. Panel 3D



Tissue Name 


Rel.

Ag4075, 
Run 

186579982 


Tissue Name 


Rel.

Ag4075, 
Run 

186579982 


Daoy- Medulloblastoma 


1.7 


Ca Ski- Cervical epidermoid 
carcinoma (metastasis) 


9.3


TE671- Medulloblastoma 


1.3 


ES-2- Ovarian clear cell carcinoma 


4.2


D283 Med- Medulloblastoma 


13.6 


Ramos- Stimulated with 
PMA/ionomycin 6h 


12.2 


PFSK-1- Primitive
Neuroectodermal 


8.0 


Ramos- Stimulated with 
PMA/ionomycin 14h 


12.2 


XF-498-CNS 


5.1 


MEG-01- Chronic myelogenous 
leukemia (megakaryoblast)


25.0 


SNB-78- Glioma 


12.9 


Raji- Burkitt's lymphoma 


2.4 


SF-268- Glioblastoma 


5.4 


Daudi- Burkitt's lymphoma 


5.0 


T98G- Glioblastoma 


7.9 


U266- B-cell plasmacytoma 


9.3 


SK-N-SH- Neuroblastoma 
(metastasis) 


4.4 


CA46- Burkitt's lymphoma


2.6 


SF-295- Glioblastoma 


8.2 


RL-non-Hodgkin's B-cell 
lymphoma 


6.5 


Cerebellum 


0.1 


JM1- pre-B-cell lymphoma


6.0 


Cerebellum 


0.1 


Jurkat- T cell leukemia 


7.6 


NCI-H292- Mucoepidermoid 
lung carcinoma 


12.0 


TF-1- Erythroleukemia


17.6 


DMS-114- Small cell lung cancer


3.0 


HUT 78- T-cell lymphoma 


4.9 


DMS-79- Small cell lung cancer 


92.0 


U937- Histiocytic lymphoma 


17.9 


NCI-H146- Small cell lung
cancer 


1.6 


KU-812- Myelogenous leukemia


15.4 


NCI-H526- Small cell lung 
cancer 


10.7 


769-P- Clear cell renal carcinoma 


5.8 
NCI-N417- Small cell lung
cancer 


3.0 


Caki-2- Clear cell renal carcinoma 


5.5 


NCI-H82- Small cell lung cancer 


5.7 


SW 839- Clear cell renal carcinoma 


6.2 


NCI-H157- Squamous cell lung 
cancer (metastasis) 


30.1 


G401- Wilms' tumor 


3.8 


NCI-H1155- Large cell lung
cancer 


9.5 


Hs766T- Pancreatic carcinoma (LN 
metastasis) 


7.6 


NCI-H1299- Large cell lung 
cancer 


6.1 


CAPAN-1- Pancreatic 
adenocarcinoma (liver metastasis) 


3.3 


NCI-H727- Lung carcinoid 


8.7 


SU86.86- Pancreatic carcinoma 
(liver metastasis) 


5.1 


NCI-UMC-11- Lung carcinoid


14.4 


BxPC-3- Pancreatic 
adenocarcinoma 


11.4 


LX-1- Small cell lung cancer 


100.0 


HPAC- Pancreatic adenocarcinoma 


6.1 


Colo-205- Colon cancer 


49.3 


MIA PaCa-2- Pancreatic carcinoma 


1.1 


KM12- Colon cancer 


12.7 


CFPAC-1- Pancreatic ductal 
adenocarcinoma 


10.4 


KM20L2- Colon cancer 


11.7 


PANC-1- Pancreatic epithelioid 
ductal carcinoma 


4.3 


NCI-H716- Colon cancer 


10.2 


T24- Bladder carcinoma (transitional cell)


1.5 


SW-48- Colon adenocarcinoma 


6.7 


5637- Bladder carcinoma 


2.8 


SW1116- Colon adenocarcinoma


20.9 


HT-1197- Bladder carcinoma


10.4 


LS 174T- Colon adenocarcinoma 


13.4 


UM-UC-3- Bladder carcinoma
(transitional cell) 


1.4 


Sw-948- Colon adenocarcinoma 


0.9 


A204- Rhabdomyosarcoma 


2.6 


Sw-480- Colon adenocarcinoma 


3.5 


HT-1080- Fibrosarcoma


4.7 


NCI-SNU-5- Gastric carcinoma 


34.6 


MG-63- Osteosarcoma 


8.1 


KATO III- Gastric carcinoma 


38.7 


SK-LMS-1- Leiomyosarcoma 
(vulva) 


8.1 


NCI-SNU-16- Gastric carcinoma 


2.9 


SJRH30- Rhabdomyosarcoma (met 
to bone marrow) 


1.9 


NCI-SNU-1- Gastric carcinoma 


22.4 


A431- Epidermoid carcinoma 


10.6 


RF-1- Gastric adenocarcinoma 


1.8 


WM266-4- Melanoma 


5.5 


RF-48- Gastric adenocarcinoma 


1.9 


DU 145- Prostate carcinoma (brain 
metastasis) 


0.1 


MKN-45- Gastric carcinoma 


12.0 


MDA-MB-468- Breast 
adenocarcinoma 


13.4 


NCI-N87- Gastric carcinoma


24.5 


SCC-4- Squamous cell carcinoma of 
tongue 


0.2 


OVCAR-5- Ovarian carcinoma 


2.3 


SCC-9- Squamous cell carcinoma of 
tongue 


0.2 


RL95-2- Uterine carcinoma 


8.3 


SCC-15- Squamous cell carcinoma 
of tongue 


0.3 


HelaS3- Cervical 
adenocarcinoma 


2.3 


CAL 27- Squamous cell carcinoma 
of tongue 


10.7 
Table AUH. Panel 4.1D




Tissue Name 


Rel. 
Exp.(%)

Run 

184565261 


Tissue Name


Rel. 

Exp.(%) 
Run 

184565261 


Secondary Th1 act


81.2 


HUVEC IL-1beta


35.1 


Secondary Th2 act 


84.1 


HUVEC IFN gamma 


17.6 


Secondary Tr1 act


0/.5 


HUVEC TNF alpha + IFN gamma


94 7 


Secondary Th1 rest


3.5 


HUVEC TNF alpha + IL4




Secondary Th2 rest 


11.3 


HUVEC IL-11 


12.4 


Secondary Tr1 rest


3.6 


Lung Microvascular EC none 


33.4 


Primary Th1 act


43.2 


Lung Microvascular EC TNFalpha
+ IL-1beta


21.0 


Primary Th2 act 


55.1 


Microvascular Dermal EC none 


20.3 


Primary Tr1 act


51.8 


Microvascular Dermal EC
TNFalpha + IL-1beta


11.7 


Primary Th1 rest


3.3 


Bronchial epithelium TNFalpha +
IL-1beta


39.8 


Primary Th2 rest 


2.2 


Small airway epithelium none 


1 C\ 0 


Primary Tr1 rest


10.3 


Small airway epithelium TNFalpha
+ IL-1beta


15.3 


CD45RA CD4 lymphocyte act 


52.5 


Coronary artery SMC rest


34.6 


CD45RO CD4 lymphocyte act 


45.7 


Coronary artery SMC TNFalpha +
IL-1beta


32.5 


CD8 lymphocyte act


51.1 


Astrocytes rest 


in o 


Secondary CD8 lymphocyte rest 


41.5 


Astrocytes TNFalpha + IL-1beta


7 1 


Secondary CD8 lymphocyte act


36.1 


KU-812 (Basophil) rest


n i 

JZ.X 


CD4 lymphocyte none 


0.6 


KU-812 (Basophil)
PMA/ionomycin


82.4 


2ry Th1/Th2/Tr1_anti-CD95
CH11


4.0 


CCD1106 (Keratinocytes) none


52.9 


LAK cells rest 


24.1 


CCD1106 (Keratinocytes) 
TNFalpha + IL-1beta




LAK cells IL-2 


34.6 


Liver cirrhosis 


2.8 


LAK cells IL-2+IL-12 


28.3 


NCI-H292 none 


27.0 


LAK cells IL-2+IFN gamma 


20.4 


NCI-H292 IL-4


53.6 


LAK cells IL-2+ IL-18 


29.5 


NCI-H292 IL-9 


29.5 


LAK cells PMA/ionomycin 


49.0 


NCI-H292 IL-13


51.4 


NK Cells IL-2 rest


43.2 


NCI-H292 IFN gamma 


58.6 


Two Way MLR 3 day 


22.4 


HPAEC none 


10.4 


Two Way MLR 5 day 


39.8 


HPAEC TNF alpha + IL-1 beta 


17.0 


Two Way MLR 7 day 


25.9 


Lung fibroblast none 


42.0 


PBMC rest 


2.3 


Lung fibroblast TNF alpha + IL-1 
beta 


17.7 
PBMC PWM


42.3 


Lung fibroblast IL-4 


^373 


PBMC PHA-L 


30.1 


Lung fibroblast IL-9 


38.4 


Ramos (B cell) none 


57.4 


Lung fibroblast IL-13 


41.2 


Ramos (B cell) ionomycin 


100.0 


Lung fibroblast IFN gamma


39.5 


B lymphocytes PWM 


31.2 


Dermal fibroblast CCD1070 rest 


84.7 


B lymphocytes CD40L and IL-4


14.5 


Dermal fibroblast CCD1070 TNF 
alpha 


59.0 


EOL-1 dbcAMP 


61.1 


Dermal fibroblast CCD1070 IL-1
beta 


55.1 


EOL-1 dbcAMP 
PMA/ionomycin


21.2 


Dermal fibroblast IFN gamma 


16.7 


Dendritic cells none 


28.5 


Dermal fibroblast IL-4


36.9 


Dendritic cells LPS 


7.9 


Dermal Fibroblasts rest


1J.U 


Dendritic cells anti-CD40 


32.8 


Neutrophils TNFa+LPS 


1.6 


Monocytes rest 


11.0 


Neutrophils rest 


0.4 


Monocytes LPS 


5.4 


Colon 


4.5 


Macrophages rest 


25.5 


Lung 


7.5


Macrophages LPS 


3.7 


Thymus 


6.3 


HUVEC none 


21.9 


Kidney 


12.9 


HUVEC starved 


27.7 







Table AUI. Panel 5 Islet

Tissue Name | Rel. Exp.(%) Ag4075, Run 186511155 | Tissue Name | Rel. Exp.(%) Ag4075, Run 186511155
97457_Patient-02go_adipose | 7.6 | 94709_Donor 2 AM - A_adipose | 45.7
97476_Patient-07sk_skeletal muscle | 2.9 | 94710_Donor 2 AM - B_adipose | 27.4
97477_Patient-07ut_uterus | 3.5 | 94711_Donor 2 AM - C_adipose | 15.2
97478_Patient-07pl_placenta | 5.0 | 94712_Donor 2 AD - A_adipose | 62.9
99167_Bayer Patient 1 | 30.6 | 94713_Donor 2 AD - B_adipose | 66.4
97482_Patient-08ut_uterus | 4.6 | 94714_Donor 2 AD - C_adipose | 57.4
97483_Patient-08pl_placenta | 3.8 | 94742_Donor 3 U - A_Mesenchymal Stem Cells | 36.1
97486_Patient-09sk_skeletal muscle | 0.3 | 94743_Donor 3 U - B_Mesenchymal Stem Cells | 62.4
97487_Patient-09ut_uterus | 8.3 | 94730_Donor 3 AM - A_adipose | 34.9
97488_Patient-09pl_placenta | 3.4 | 94731_Donor 3 AM - B_adipose | 17.2
97492_Patient-10ut_uterus | 7.5 | 94732_Donor 3 AM - C_adipose | 22.4
97493_Patient-10pl_placenta | 5.1 | 94733_Donor 3 AD - A_adipose | 100.0
97495_Patient-11go_adipose | 6.4 | 94734_Donor 3 AD - B_adipose | 32.3
97496_Patient-11sk_skeletal muscle | 1.3 | 94735_Donor 3 AD - C_adipose | 66.9
97497_Patient-11ut_uterus | 11.6 | 77138_Liver_HepG2untreated | 31.4
97498_Patient-11pl_placenta | 3.9 | 73556_Heart_Cardiac stromal cells (primary) | 3.6
97500_Patient-12go_adipose | 8.5 | 81735_Small Intestine | 6.4
97501_Patient-12sk_skeletal muscle | 9.7 | 72409_Kidney_Proximal Convoluted Tubule | 3.8
97502_Patient-12ut_uterus | 8.7 | 82685_Small intestine_Duodenum | 1.9
97503_Patient-12pl_placenta | 3.1 | 90650_Adrenal_Adrenocortical adenoma | 1.4
94721_Donor 2 U - A_Mesenchymal Stem Cells | 40.1 | 72410_Kidney_HRCE | 14.9
94722_Donor 2 U - B_Mesenchymal Stem Cells | 23.7 | 72411_Kidney_HRE | 11.1
94723_Donor 2 U - C_Mesenchymal Stem Cells | 52.5 | 73139_Uterus_Uterine smooth muscle cells | 17.4



Table AUJ. Panel 5D

Tissue Name | Rel. Exp.(%) Ag3788, Run 170222681 | Rel. Exp.(%) Ag4075, Run 172167823 | Tissue Name | Rel. Exp.(%) Ag3788, Run 170222681 | Rel. Exp.(%) Ag4075, Run 172167823
97457_Patient-02go_adipose | 8.2 | 11.0 | 94709_Donor 2 AM - A_adipose | 44.1 | 53.2
97476_Patient-07sk_skeletal muscle | 2.1 | 2.8 | 94710_Donor 2 AM - B_adipose | 31.2 | 28.3
97477_Patient-07ut_uterus | 3.5 | 7.1 | 94711_Donor 2 AM - C_adipose | 29.3 | 30.8
97478_Patient-07pl_placenta | 5.1 | 5.8 | 94712_Donor 2 AD - A_adipose | 77.4 | 81.8
97481_Patient-08sk_skeletal muscle | 4.2 | 3.9 | 94713_Donor 2 AD - B_adipose | 100.0 | 100.0
97482_Patient-08ut_uterus | 5.7 | 8.7 | 94714_Donor 2 AD - C_adipose | 68.8 | 84.1
97483_Patient-08pl_placenta | 7.5 | 7.2 | 94742_Donor 3 U - A_Mesenchymal Stem Cells | 55.1 | 66.9
97486_Patient-09sk_skeletal muscle | 0.9 | 1.2 | 94743_Donor 3 U - B_Mesenchymal Stem Cells | 62.9 | 70.7
97487_Patient-09ut_uterus | 8.5 | 11.0 | 94730_Donor 3 AM - A_adipose | 41.5 | 46.7
97488_Patient-09pl_placenta | 4.9 | 4.2 | 94731_Donor 3 AM - B_adipose | 29.7 | 29.5
97492_Patient-10ut_uterus | 5.7 | 5.8 | 94732_Donor 3 AM - C_adipose | 25.7 | 36.6
97493_Patient-10pl_placenta | 6.3 | 7.0 | 94733_Donor 3 AD - A_adipose | 97.3 | 92.7
97495_Patient-11go_adipose | 7.3 | 8.8 | 94734_Donor 3 AD - B_adipose | 58.6 | 80.7
97496_Patient-11sk_skeletal muscle | 1.7 | 1.3 | 94735_Donor 3 AD - C_adipose | 69.3 | 83.5
97497_Patient-11ut_uterus | 12.9 | 15.7 | 77138_Liver_HepG2untreated | 72.7 | 80.7
97498_Patient-11pl_placenta | 4.7 | 6.8 | 73556_Heart_Cardiac stromal cells (primary) | 2.6 | 4.7
97500_Patient-12go_adipose | 9.5 | 12.6 | 81735_Small Intestine | 7.6 | 8.9
97501_Patient-12sk_skeletal muscle | 2.7 | 2.4 | 72409_Kidney_Proximal Convoluted Tubule | 4.6 | 4.3
97502_Patient-12ut_uterus | 9.3 | 10.7 | 82685_Small intestine_Duodenum | 1.9 | 2.0
97503_Patient-12pl_placenta | 3.0 | 3.1 | 90650_Adrenal_Adrenocortical adenoma | 1.4 | 1.1
94721_Donor 2 U - A_Mesenchymal Stem Cells | 50.3 | 52.9 | 72410_Kidney_HRCE | 21.9 | 21.5
94722_Donor 2 U - B_Mesenchymal Stem Cells | 45.4 | 47.3 | 72411_Kidney_HRE | 15.7 | 0.0
94723_Donor 2 U - C_Mesenchymal Stem Cells | 52.1 | 45.4 | 73139_Uterus_Uterine smooth muscle cells | 23.7 | 28.3



Table AUK. general oncology screening panel_v_2.4

Tissue Name | Rel. Exp.(%) Ag4075, Run 259745203 | Tissue Name | Rel. Exp.(%) Ag4075, Run 259745203
Colon cancer 1 | 50.7 | Bladder cancer NAT 2 | 0.1
Colon cancer NAT 1 | 13.5 | Bladder cancer NAT 3 | 0.0
Colon cancer 2 | 47.0 | Bladder cancer NAT 4 | 0.1
Colon cancer NAT 2 | 24.3 | Prostate adenocarcinoma 1 | 33.9
Colon cancer 3 | 95.9 | Prostate adenocarcinoma 2 | 3.6
Colon cancer NAT 3 | 16.2 | Prostate adenocarcinoma 3 | 26.4
Colon malignant cancer 4 | 55.9 | Prostate adenocarcinoma 4 | 100.0
Colon normal adjacent tissue 4 | 6.2 | Prostate cancer NAT 5 | 6.8
Lung cancer 1 | 11.4 | Prostate adenocarcinoma 6 | 11.2
Lung NAT 1 | 0.6 | Prostate [illegible] | [illegible]
Lung cancer 2 | 12.9 | Prostate adenocarcinoma 8 | 2.6
Lung NAT 2 | 1.0 | Prostate adenocarcinoma 9 | 38.2
Squamous cell carcinoma 3 | 62.0 | Prostate cancer NAT 10 | 0.6
Lung NAT 3 | 1.1 | Kidney cancer 1 | 7.9
metastatic melanoma 1 | 20.2 | Kidney NAT 1 | 2.9
Melanoma 2 | 3.1 | Kidney cancer 2 | 28.1
Melanoma 3 | 1.7 | Kidney NAT 2 | 8.5
metastatic melanoma 4 | 57.0 | Kidney cancer 3 | 13.9
metastatic melanoma 5 | 25.3 | Kidney NAT 3 | 2.1
Bladder cancer 1 | 0.2 | Kidney cancer 4 | 9.6
Bladder cancer NAT 1 | 0.0 | Kidney NAT 4 | 11.2
Bladder cancer 2 | 11.7 | |







AI_comprehensive panel_v1.0 Summary: Ag4075 Highest expression is seen in
an osteoarthritic bone sample (CT=27.31). This gene is expressed at moderate to low levels
in many samples on this panel. Please see Panel 4.1 for discussion of this gene in
inflammation.

CNS_neurodegeneration_v1.0 Summary: Ag4075 This panel does not show
differential expression of this gene in Alzheimer's disease. However, this profile confirms
the expression of this gene at moderate levels in the brain. Please see Panel 1.4 for
discussion of this gene in the central nervous system.

General_screening_panel_v1.4 Summary: Ag4075 Two experiments with the
same probe and primer set produce results that are in excellent agreement. Highest
expression is seen in a colon cancer cell line (CTs=21-22). Overall, expression of this gene
appears to be highly associated with cancer cell line samples, with high levels of
expression in brain, colon, gastric, lung, breast, ovarian, and melanoma cancer cell lines.
This expression profile suggests a role for this gene product in cell survival and
proliferation. This gene encodes a protein with homology to Neutral amino acid transporter
2. L type amino acid transporter 1 (LAT1) has been implicated in tumor growth and may
play an important role in supplying nutrition to cells for cell proliferation (Ohkame, J Surg
Oncol 2001 Dec;78(4):265-71; discussion 271-2). Thus, modulation of this gene product
may be useful in the treatment of cancer.

Among tissues with metabolic function, this gene is expressed at moderate levels in
pituitary, adipose, adrenal gland, pancreas, thyroid, and adult and fetal skeletal muscle,
heart, and liver. This widespread expression among these tissues suggests that this gene
product may play a role in normal neuroendocrine and metabolic function and that
dysregulated expression of this gene may contribute to neuroendocrine disorders or
metabolic diseases, such as obesity and diabetes.

This gene is also expressed at moderate levels in the CNS, including the
hippocampus, thalamus, substantia nigra, amygdala, cerebellum and cerebral cortex.
Therefore, therapeutic modulation of the expression or function of this gene may be useful
in the treatment of neurologic disorders, such as Alzheimer's disease, Parkinson's disease,
schizophrenia, multiple sclerosis, stroke and epilepsy.

In addition, this gene is expressed at much higher levels in fetal lung and liver tissue
(CTs=26-27) when compared to expression in the adult counterparts (CTs=31-33). Thus,
expression of this gene may be used to differentiate between the fetal and adult sources of
these tissues.
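
The fetal-versus-adult comparison rests on CT values, which sit on a logarithmic scale. As a rough sketch only, assuming that the amplicon doubles every cycle (100% amplification efficiency) and that RNA input is equal across samples, neither of which is stated in this passage, a CT difference can be converted to a fold difference as shown here. The function names are illustrative and do not come from the application.

```python
# Sketch only: converts CT differences to fold differences under the
# assumption of perfect doubling per PCR cycle (not stated in the text).

def fold_difference(ct_low_expresser: float, ct_high_expresser: float) -> float:
    """Return how many times more template the lower-CT sample contains."""
    return 2.0 ** (ct_low_expresser - ct_high_expresser)


def relative_expression_percent(ct_by_sample: dict) -> dict:
    """Scale every sample so the lowest-CT (highest-expressing) one is 100%."""
    ct_min = min(ct_by_sample.values())
    return {name: 100.0 * 2.0 ** (ct_min - ct) for name, ct in ct_by_sample.items()}


# Fetal tissue CTs of 26-27 versus adult CTs of 31-33 (values from the text).
smallest_gap = fold_difference(31.0, 27.0)  # adult CT 31 vs fetal CT 27
largest_gap = fold_difference(33.0, 26.0)   # adult CT 33 vs fetal CT 26
```

Under these assumptions, the reported CT ranges would correspond to roughly a 16-fold to 128-fold difference between the fetal and adult samples.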

General_screening_panel_v1.5 Summary: Ag4075 Highest expression is seen in
a colon cancer cell line (CT=20), with expression in this panel in strong agreement with
Panel 1.4. Please see that panel for discussion of this gene in disease.

Panel 3D Summary: Ag4075 Expression of this gene is widespread on this panel,
with highest expression in a lung cancer cell line (CT=26). The widespread expression on
this panel is in agreement with expression in Panels 1.4 and 1.5 where expression of this
gene is highly associated with cancer cell line samples. Please see Panel 1.4 for discussion
of this gene in oncology.

Panel 4.1D Summary: Ag4075 Highest expression of this gene is seen in a sample
derived from the Ramos B cell line treated with ionomycin (CT=27.3). In addition, this
gene appears to be more highly expressed in activated T cells than in resting T cells. Thus,
therapeutic regulation of the transcript or the protein encoded by the transcript could be
important in immune modulation and in the treatment of T cell-mediated diseases such as
asthma, arthritis, psoriasis, IBD, and lupus. This gene is also expressed at
moderate levels in a wide range of cell types of significance in the immune response in
health and disease. These cells include members of the T-cell, B-cell, endothelial cell,
macrophage/monocyte, and peripheral blood mononuclear cell family, as well as epithelial
and fibroblast cell types from lung and skin, and normal tissues represented by colon, lung,
thymus and kidney. This ubiquitous pattern of expression suggests that this gene product
may be involved in homeostatic processes for these and other cell types and tissues. This
pattern is in agreement with the expression profile in General_screening_panel_v1.4 and
also suggests a role for the gene product in cell survival and proliferation. Therefore,
modulation of the gene product with a functional therapeutic may lead to the alteration of
functions associated with these cell types and lead to improvement of the symptoms of
patients suffering from autoimmune and inflammatory diseases such as asthma, allergies,
inflammatory bowel disease, lupus erythematosus, psoriasis, rheumatoid arthritis, and
osteoarthritis.

Panel 5 Islet Summary: Ag4075 Highest expression is seen in adipose (CT=27).
In addition, expression of this gene is widespread on this panel, with moderate to high
levels in metabolic tissues, including skeletal muscle, adipose, pancreatic islet cells and
placenta. This gene codes for neutral amino acid transporter B(0) [ATB(0)]. ATB(0)
transports the gluconeogenic amino acids L-alanine and L-glutamine into cells. Excess
neutral amino acid transport and a resultant increase in gluconeogenesis and triglyceride
synthesis may impair beta cell function in obesity and Type 2 diabetes. Pharmacologic
inhibition of ATB(0) encoded by this gene may prevent or treat the symptoms of
obesity-related Type 2 diabetes.

Panel 5D Summary: Ag4075 Expression on this panel agrees with Panel 5I.
Highest expression is seen in adipose in two replicate experiments (CTs=28). Please see
Panel 5I and 1.4 for further discussion of utility of this gene in metabolic disease.

general oncology screening panel_v_2.4 Summary: Ag4075 Highest expression
of this gene is seen in prostate cancer (CT=27). Prominent expression is also seen in
melanoma and squamous cell carcinoma derived samples. In addition, this gene appears to
be overexpressed in colon, lung, and prostate cancer when compared to expression in the
normal adjacent tissue. Thus, expression of this gene could be used as a marker to detect
the presence of colon, lung and prostate cancer. Furthermore, therapeutic modulation of the
expression or function of this gene may be effective in the treatment of colon, prostate,
melanoma and lung cancer.
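
The comparison of tumor samples with normal adjacent tissue (NAT) can be made concrete with a small calculation. The sketch below uses Rel. Exp.(%) values transcribed from Table AUK; pairing each tumor with the same-numbered NAT sample is an assumption made here for illustration, as are the variable and function names.

```python
# Sketch: ratio of tumor to normal adjacent tissue (NAT) expression, using
# Rel. Exp.(%) values transcribed from Table AUK. Pairing a tumor with the
# same-numbered NAT sample is an assumption made for illustration.

PAIRS = {
    "Colon 1": (50.7, 13.5),
    "Colon 2": (47.0, 24.3),
    "Colon 3": (95.9, 16.2),
    "Colon 4": (55.9, 6.2),
    "Lung 1": (11.4, 0.6),
    "Lung 2": (12.9, 1.0),
}


def tumor_to_nat_ratio(tumor: float, nat: float) -> float:
    """Return tumor expression divided by NAT expression."""
    if nat <= 0:
        raise ValueError("NAT expression must be positive to form a ratio")
    return tumor / nat


ratios = {name: tumor_to_nat_ratio(t, n) for name, (t, n) in PAIRS.items()}
# Pairs in which the tumor value exceeds the NAT value.
overexpressed = sorted(name for name, r in ratios.items() if r > 1.0)
```

A ratio above 1 for a pair is what the summary describes as overexpression relative to the normal adjacent tissue; how large a ratio would be needed for a useful marker is not addressed by this sketch.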



Example D: Identification of Single Nucleotide Polymorphisms in NOVX
nucleic acid sequences

Variant sequences are also included in this application. A variant sequence can
include a single nucleotide polymorphism (SNP). A SNP can, in some instances, be
referred to as a "cSNP" to denote that the nucleotide sequence containing the SNP
originates as a cDNA. A SNP can arise in several ways. For example, a SNP may be due to
a substitution of one nucleotide for another at the polymorphic site. Such a substitution can
be either a transition or a transversion. A SNP can also arise from a deletion of a
nucleotide or an insertion of a nucleotide, relative to a reference allele. In this case, the
polymorphic site is a site at which one allele bears a gap with respect to a particular
nucleotide in another allele. SNPs occurring within genes may result in an alteration of the
amino acid encoded by the gene at the position of the SNP. Intragenic SNPs may also be
silent, when a codon including a SNP encodes the same amino acid as a result of the
redundancy of the genetic code. SNPs occurring outside the region of a gene, or in an
intron within a gene, do not result in changes in any amino acid sequence of a protein but
may result in altered regulation of the expression pattern. Examples include alteration in
temporal expression, physiological response regulation, cell type expression regulation,
intensity of expression, and stability of transcribed message.
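
The distinctions drawn in this paragraph (transition versus transversion, silent versus amino-acid-changing) can be sketched in a few lines. The code below is illustrative only: the function names are hypothetical, and reading an amino acid position of 0 as "outside the coding region" is an assumption based on how the SNP tables in this example are laid out, not a rule stated in the text.

```python
# Minimal sketch: classify a single-nucleotide substitution.
# Function names are illustrative and do not come from the application.

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_type(initial: str, modified: str) -> str:
    """Return 'transition' (purine<->purine or pyrimidine<->pyrimidine)
    or 'transversion' (purine<->pyrimidine)."""
    initial, modified = initial.upper(), modified.upper()
    bases = PURINES | PYRIMIDINES
    if initial not in bases or modified not in bases:
        raise ValueError("bases must be A, C, G or T")
    if initial == modified:
        raise ValueError("initial and modified bases are identical")
    pair = {initial, modified}
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def coding_effect(aa_position: int, initial_aa: str, modified_aa: str) -> str:
    """Classify the protein-level effect of a SNP.

    Assumption: aa_position == 0 marks a SNP outside the coding region.
    """
    if aa_position == 0:
        return "non-coding"
    if initial_aa == modified_aa:
        return "silent"
    if modified_aa == "End":
        return "nonsense"
    return "missense"
```

For example, a C-to-T change that converts Pro to Leu would be reported as a transition with a missense effect, while an A-to-C change that leaves Thr unchanged would be a silent transversion.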

SeqCalling assemblies produced by the exon linking process were selected and
extended using the following criteria. Genomic clones having regions with 98% identity to
all or part of the initial or extended sequence were identified by BLASTN searches using
the relevant sequence to query human genomic databases. The genomic clones that
resulted were selected for further analysis because this identity indicates that these clones
contain the genomic locus for these SeqCalling assemblies. These sequences were
analyzed for putative coding regions as well as for similarity to the known DNA and
protein sequences. Programs used for these analyses include Grail, Genscan, BLAST,
HMMER, FASTA, Hybrid and other relevant programs.
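
The 98% identity criterion can be illustrated with a simple calculation on an already-aligned pair of sequences. This is a sketch under stated assumptions: percent identity is taken as identical columns divided by alignment length (gap columns counted in the length), and the function names and the handling of the threshold are hypothetical rather than taken from the application or from any BLAST program.

```python
# Sketch: percent identity of two aligned sequences of equal length,
# where '-' marks a gap. Identity is computed as matching columns divided
# by total alignment length; this definition is an assumption made here.

def percent_identity(query_aln: str, subject_aln: str) -> float:
    if len(query_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have the same length")
    if not query_aln:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for q, s in zip(query_aln.upper(), subject_aln.upper())
        if q == s and q != "-"
    )
    return 100.0 * matches / len(query_aln)


def meets_identity_criterion(query_aln: str, subject_aln: str,
                             threshold: float = 98.0) -> bool:
    """True when the aligned region reaches the identity threshold."""
    return percent_identity(query_aln, subject_aln) >= threshold
```

In practice the identity would be read from the search program's own report rather than recomputed; the sketch only shows what the threshold means for a given alignment.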

Some additional genomic regions may have also been identified because selected
SeqCalling assemblies map to those regions. Such SeqCalling sequences may have
overlapped with regions defined by homology or exon prediction. They may also be
included because the location of the fragment was in the vicinity of genomic regions
identified by similarity or exon prediction that had been included in the original predicted
sequence. The sequence so identified was manually assembled and then may have been
extended using one or more additional sequences taken from CuraGen Corporation's human
SeqCalling database. SeqCalling fragments suitable for inclusion were identified by the
CuraTools™ program SeqExtend or by identifying SeqCalling fragments mapping to the
appropriate regions of the genomic clones analyzed.

The regions defined by the procedures described above were then manually
integrated and corrected for apparent inconsistencies that may have arisen, for example,
from miscalled bases in the original fragments or from discrepancies between predicted
exon junctions, EST locations and regions of sequence similarity, to derive the final
sequence disclosed herein. When necessary, the process to identify and analyze SeqCalling
assemblies and genomic clones was reiterated to derive the full length sequence (Alderborn
et al., Determination of Single Nucleotide Polymorphisms by Real-time Pyrophosphate
DNA Sequencing. Genome Research. 10 (8) 1249-1265, 2000).

Variants are reported individually, but any combination of all or a select subset of
variants is also included as a contemplated NOVX embodiment of the invention.

NOV1a SNP Data:

NOV1a has one SNP variant, whose variant positions for its nucleotide and amino
acid sequences are numbered according to SEQ ID NOs:1 and 2, respectively. The
nucleotide sequence of the NOV1a variant differs as shown in Table SNP1.

Table SNP1.

Variant | Nucleotides: Position | Initial | Modified | Amino Acids: Position | Initial | Modified
13375555 | 4319 | C | T | 1440 | Pro | Leu

NOV2b SNP Data: 

NOV2b has six SNP variants, whose variant positions for its nucleotide and amino 
acid sequences are numbered according to SEQ ID NOs:17 and 18, respectively. The 
nucleotide sequence of the NOV2b variant differs as shown in Table SNP2. 

Table SNP2.

Variant | Nucleotides: Position | Initial | Modified | Amino Acids: Position | Initial | Modified
12252060 | 100 | A | T | 34 | Ile | Phe
13380837 | 204 | A | C | 68 | Thr | Thr
13380838 | 209 | G | A | 70 | Gly | Asp
13380839 | 254 | A | G | 85 | Gln | Arg
13380843 | 605 | C | T | 202 | Ala | Val
13380844 | 614 | C | T | 205 | Ala | Val





NOV3b SNP Data:

NOV3b has seven SNP variants, whose variant positions for its nucleotide and
amino acid sequences are numbered according to SEQ ID NOs:21 and 22, respectively.
The nucleotide sequence of the NOV3b variant differs as shown in Table SNP3.

Table SNP3.

Variant | Nucleotides: Position | Initial | Modified | Amino Acids: Position | Initial | Modified
13375856 | 338 | G | A | 0 | |
13380855 | 397 | T | G | 0 | |
13380857 | 1134 | T | C | 243 | Val | Ala
13375853 | 1362 | G | A | 319 | Arg | His
13380859 | 1376 | A | G | 324 | Thr | Ala
13380860 | 1426 | C | T | 340 | Cys | Cys
13380861 | 1496 | C | T | 0 | |







NOV4b SNP Data: 

NOV4b has eleven SNP variants, whose variant positions for its nucleotide and 
amino acid sequences are numbered according to SEQ ID NOs:27 and 28, respectively. 
The nucleotide sequence of the NOV4b variant differs as shown in Table SNP4. 

Table SNP4. 



Variant | Nucleotides: Position | Initial | Modified | Amino Acids: Position | Initial | Modified
13380847 | 73 | G | C | 12 | Arg | Pro
13380848 | 116 | G | A | 26 | Arg | Arg
13380849 | 117 | A | T | 27 | Ile | Phe
13380862 | 200 | G | T | 54 | Lys | Asn
13380863 | 222 | G | T | 62 | [illegible] | End
13380864 | 243 | G | T | 69 | Glu | End
13380850 | 338 | C | T | 100 | Ile | Ile
13380851 | 438 | G | T | 134 | Ala | Ser
13380865 | 779 | A | T | 247 | Pro | Pro
13380852 | 1023 | C | G | 329 | Pro | Ala
13380853 | 1494 | C | T | 0 | |







NOV6a SNP Data: 

NOV6a has two SNP variants, whose variant positions for its nucleotide and amino
acid sequences are numbered according to SEQ ID NOs:33 and 34, respectively. The
nucleotide sequence of the NOV6a variant differs as shown in Table SNP5.

Table SNP5.

Variant | Nucleotides: Position | Initial | Modified | Amino Acids: Position | Initial | Modified
13380868 | 1646 | T | C | 539 | Val | Ala
13380869 | 2992 | T | C | 988 | Cys | Arg



NOV11a SNP Data:

NOV11a has one SNP variant, whose variant positions for its nucleotide and amino
acid sequences are numbered according to SEQ ID NOs:47 and 48, respectively. The
nucleotide sequence of the NOV11a variant differs as shown in Table SNP6.

Table SNP6.

Variant | Nucleotides: Position | Initial | Modified | Amino Acids: Position | Initial | Modified
13380962 | 41 | G | T | 0 | |

NOV12a SNP Data:

NOV12a has three SNP variants, whose variant positions for its nucleotide and
amino acid sequences are numbered according to SEQ ID NOs:63 and 64, respectively.
The nucleotide sequence of the NOV12a variant differs as shown in Table SNP7.

Table SNP7.

Variant | Nucleotides: Position | Initial | Modified | Amino Acids: Position | Initial | Modified
13380902 | 594 | C | T | 193 | Ser | Ser
13380901 | 1392 | A | G | 0 | |
13380900 | 1425 | C | T | 0 | |

NOV13a SNP Data: 

NOV13a has one SNP variant, whose variant positions for its nucleotide and amino
acid sequences are numbered according to SEQ ID NOs:65 and 66, respectively. The
nucleotide sequence of the NOV13a variant differs as shown in Table SNP8.

Table SNP8.

Variant | Nucleotides: Position | Initial | Modified | Amino Acids: Position | Initial | Modified
13380964 | 204 | C | T | 68 | Leu | Leu



NOV14a SNP Data:

NOV14a has two SNP variants, whose variant positions for its nucleotide and
amino acid sequences are numbered according to SEQ ID NOs:73 and 74, respectively.
The nucleotide sequence of the NOV14a variant differs as shown in Table SNP9.

Table SNP9.

Variant | Nucleotides: Position | Initial | Modified | Amino Acids: Position | Initial | Modified
13380922 | 106 | C | G | 28 | Pro | Pro
13380923 | 760 | A | G | 246 | Pro | Pro

NOV15a SNP Data: 

NOV15a has two SNP variants, whose variant positions for its nucleotide and
amino acid sequences are numbered according to SEQ ID NOs:77 and 78, respectively.
The nucleotide sequence of the NOV15a variant differs as shown in Table SNP10.

Table SNP10.

Variant | Nucleotides: Position | Initial | Modified | Amino Acids: Position | Initial | Modified
13380896 | 19 | T | C | 4 | Phe | Leu
13380897 | 258 | G | A | 83 | Pro | Pro



NOV20a SNP Data: 

NOV20a has seven SNP variants, whose variant positions for its nucleotide and
amino acid sequences are numbered according to SEQ ID NOs:107 and 108, respectively.
The nucleotide sequence of the NOV20a variant differs as shown in Table SNP11.

Table SNP11.

Variant | Nucleotides: Position | Initial | Modified | Amino Acids: Position | Initial | Modified
13380969 | 155 | G | A | 0 | |
13380970 | 448 | A | G | 79 | His | Arg
13380971 | 475 | G | C | 88 | Cys | Ser
13380972 | 780 | A | G | 190 | Arg | Gly
13380974 | 890 | A | G | 226 | Arg | Arg
13380975 | 1798 | A | G | 0 | |
13380976 | 2564 | A | G | 0 | |

NOV26a SNP Data: 

NOV26a has one SNP variant, whose variant positions for its nucleotide and amino
acid sequences are numbered according to SEQ ID NOs:119 and 120, respectively. The
nucleotide sequence of the NOV26a variant differs as shown in Table SNP12.

Table SNP12.

Variant | Nucleotides: Position | Initial | Modified | Amino Acids: Position | Initial | Modified
13377803 | 98 | G | A | 25 | Met | Ile



NOV27a SNP Data: 

NOV27a has two SNP variants, whose variant positions for its nucleotide and
amino acid sequences are numbered according to SEQ ID NOs:121 and 122, respectively.
The nucleotide sequence of the NOV27a variant differs as shown in Table SNP13.

Table SNP13.

Variant | Nucleotides: Position | Initial | Modified | Amino Acids: Position | Initial | Modified
13380980 | 186 | A | G | 22 | Thr | Ala
13380979 | 292 | C | T | 57 | Thr | Ile



NOV28a SNP Data:

NOV28a has two SNP variants, whose variant positions for its nucleotide and
amino acid sequences are numbered according to SEQ ID NOs:123 and 124, respectively.
The nucleotide sequence of the NOV28a variant differs as shown in Table SNP14.

Table SNP14.

Variant | Nucleotides: Position | Initial | Modified | Amino Acids: Position | Initial | Modified
13380981 | 2192 | G | A | 721 | Arg | Lys
13380982 | 2283 | C | T | 751 | Phe | Phe



NOV29a SNP Data: 

NOV29a has one SNP variant, whose variant positions for its nucleotide and amino
acid sequences are numbered according to SEQ ID NOs:127 and 128, respectively. The
nucleotide sequence of the NOV29a variant differs as shown in Table SNP15.

Table SNP15.

Variant | Nucleotides: Position | Initial | Modified | Amino Acids: Position | Initial | Modified
13380985 | 46 | [illegible] | [illegible] | | |



NOV31a SNP Data:

NOV31a has one SNP variant, whose variant positions for its nucleotide and amino
acid sequences are numbered according to SEQ ID NOs:133 and 134, respectively. The
nucleotide sequence of the NOV31a variant differs as shown in Table SNP16.

Table SNP16.

Variant | Nucleotides: Position | Initial | Modified | Amino Acids: Position | Initial | Modified
13380984 | 1232 | G | A | 335 | Gly | Ser



NOV34a SNP Data: 



NOV34a has two SNP variants, whose variant positions for its nucleotide and
amino acid sequences are numbered according to SEQ ID NOs:141 and 142, respectively.
The nucleotide sequence of the NOV34a variant differs as shown in Table SNP17.

Table SNP17.

Variant | Nucleotides: Position | Initial | Modified | Amino Acids: Position | Initial | Modified
13380987 | 1145 | G | C | 362 | Arg | Thr
13380988 | 1749 | A | T | 0 | |



NOV35a SNP Data:

NOV35a has one SNP variant, whose variant positions for its nucleotide and amino
acid sequences are numbered according to SEQ ID NOs:143 and 144, respectively. The
nucleotide sequence of the NOV35a variant differs as shown in Table SNP18.

Table SNP18.

Variant | Nucleotides: Position | Initial | Modified | Amino Acids: Position | Initial | Modified
13380995 | 85 | C | T | 22 | Thr | Ile

NOV36a SNP Data: 

NOV36a has three SNP variants, whose variant positions for its nucleotide and
amino acid sequences are numbered according to SEQ ID NOs:153 and 154, respectively.
The nucleotide sequence of the NOV36a variant differs as shown in Table SNP19.

Table SNP19.

Variant | Nucleotides: Position | Initial | Modified | Amino Acids: Position | Initial | Modified
13380998 | 411 | G | A | 122 | Ser | Asn
13381013 | 492 | T | C | 149 | Leu | Pro
13380999 | 686 | T | C | 214 | Cys | Arg

NOV37a SNP Data: 

NOV37a has one SNP variant, whose variant positions for its nucleotide and amino
acid sequences are numbered according to SEQ ID NOs:155 and 156, respectively. The
nucleotide sequence of the NOV37a variant differs as shown in Table SNP20.

Table SNP20.

Variant | Nucleotides: Position | Initial | Modified | Amino Acids: Position | Initial | Modified
13381009 | 2077 | C | G | 0 | |







NOV38a SNP Data:

NOV38a has one SNP variant, whose variant positions for its nucleotide and amino
acid sequences are numbered according to SEQ ID NOs:157 and 158, respectively. The
nucleotide sequence of the NOV38a variant differs as shown in Table SNP21.

Table SNP21.

Variant | Nucleotides: Position | Initial | Modified | Amino Acids: Position | Initial | Modified
13378369 | 994 | C | T | 330 | Ser | Leu



NOV40a SNP Data: 

NOV40a has one SNP variant, whose variant positions for its nucleotide and amino
acid sequences are numbered according to SEQ ID NOs:167 and 168, respectively. The
nucleotide sequence of the NOV40a variant differs as shown in Table SNP22.

Table SNP22.

Variant | Nucleotides: Position | Initial | Modified | Amino Acids: Position | Initial | Modified
13381011 | 32 | A | G | 0 | |

NOV41a SNP Data:

NOV41a has two SNP variants, whose variant positions for its nucleotide and
amino acid sequences are numbered according to SEQ ID NOs:173 and 174, respectively.
The nucleotide sequence of the NOV41a variant differs as shown in Table SNP23.

Table SNP23.

Variant | Nucleotides: Position | Initial | Modified | Amino Acids: Position | Initial | Modified
13380997 | 247 | A | G | 55 | Asn | Asp
13380996 | 417 | A | G | 111 | Lys | Lys



NOV43a SNP Data: 

NOV43a has eight SNP variants, whose variant positions for its nucleotide and 
amino acid sequences are numbered according to SEQ ID NOs:181 and 182, respectively. 
The nucleotide sequence of the NOV43a variant differs as shown in Table SNP24. 

Table SNP24. 



Variant | Nucleotides: Position | Initial | Modified | Amino Acids: Position | Initial | Modified
13381140 | 184 | G | A | 61 | Asp | Asn
13381141 | 337 | T | C | 112 | Phe | Leu
13381158 | 729 | G | T | 242 | Met | Ile
13381157 | 748 | A | G | 249 | Ser | Gly
13381156 | 934 | T | C | 311 | Phe | Leu
13381142 | 1916 | A | G | 0 | |
13381143 | 2123 | T | A | 0 | |
13381148 | 2260 | [illegible] | [illegible] | [illegible] | |





NOV44a SNP Data:

NOV44a has one SNP variant, whose variant positions for its nucleotide and amino
acid sequences are numbered according to SEQ ID NOs:183 and 184, respectively. The
nucleotide sequence of the NOV44a variant differs as shown in Table SNP25.

Table SNP25.

Variant | Nucleotides: Position | Initial | Modified | Amino Acids: Position | Initial | Modified
13381168 | 1096 | C | T | 0 | |







NOV45a SNP Data: 

NOV45a has two SNP variants, whose variant positions for its nucleotide and
amino acid sequences are numbered according to SEQ ID NOs:185 and 186, respectively.
The nucleotide sequence of the NOV45a variant differs as shown in Table SNP26.

Table SNP26.

Variant | Nucleotides: Position | Initial | Modified | Amino Acids: Position | Initial | Modified
13381163 | 1269 | T | C | 399 | Cys | Arg
13381162 | 1418 | C | T | 0 | |

NOV46a SNP Data:

NOV46a has one SNP variant, whose variant positions for its nucleotide and amino
acid sequences are numbered according to SEQ ID NOs:187 and 188, respectively. The
nucleotide sequence of the NOV46a variant differs as shown in Table SNP27.

Table SNP27.

Variant | Nucleotides: Position | Initial | Modified | Amino Acids: Position | Initial | Modified
13381020 | 820 | T | C | 267 | Phe | Phe

NOV48b SNP Data: 

NOV48b has five SNP variants, whose variant positions for its nucleotide and
amino acid sequences are numbered according to SEQ ID NOs:193 and 194, respectively.
The nucleotide sequence of the NOV48b variant differs as shown in Table SNP28.

Table SNP28.

Variant | Nucleotides: Position | Initial | Modified | Amino Acids: Position | Initial | Modified
13375777 | 107 | A | G | 14 | His | Arg
13376584 | 116 | G | A | 17 | Ser | Asn
13381146 | 448 | T | C | 128 | Cys | Arg
13378857 | 1282 | G | A | 406 | Gly | Ser
13376583 | 1297 | C | T | 411 | Pro | Ser

NOV49a SNP Data: 

NOV49a has twenty-one SNP variants, whose variant positions for its nucleotide
and amino acid sequences are numbered according to SEQ ID NOs:195 and 196,
respectively. The nucleotide sequences of the NOV49a variants differ as shown in Table
SNP29.

Table SNP29.

Variant   Nucleotides                            Amino Acids
          Position     Initial      Modified     Position     Initial      Modified
13379126  [illegible]  C            T            17           Ala          Val
13375663  [illegible]  C            G            [illegible]  Leu          Val
13375662  [illegible]  T            C            [illegible]  Leu          Pro
13379016  [illegible]  A            G            [illegible]  Ser          Gly
13378698  [illegible]  C            T            [illegible]  Phe          Phe
13381282  [illegible]  C            T            [illegible]  [illegible]  [illegible]
13381193  [illegible]  A            C            [illegible]  Thr          Thr
13381194  [illegible]  G            A            [illegible]
13381283  631          A            G            165          Lys          Lys
13378699  840          G            A            235          Ser          Asn
13378106  909          A            G            258          Asp          Gly
13381284  924          A            G            263          Lys          Arg
13377887  954          A            G            273          Glu          Gly
13381285  967          C            T            277          Gly          Gly
13381286  1009         A            G            291          Thr          Thr
13377889  1083         A            G            316          Gln          Arg
13381287  1107         A            G            324          Glu          Gly
13377890  1113         T            C            326          Val          Ala
13377891  1137         A            C            334          Gln          Pro
13381288  1196         C            G            0
13381289  1202         A            G            0

NOV50b SNP Data:

NOV50b has three SNP variants, whose variant positions for its nucleotide and
amino acid sequences are numbered according to SEQ ID NOs:219 and 220, respectively.
The nucleotide sequences of the NOV50b variants differ as shown in Table SNP30.

Table SNP30.

Variant   Nucleotides                   Amino Acids
          Position  Initial   Modified  Position  Initial   Modified
13381192  216       G         A         48        Glu       Glu
13381177  602       G         T         177       Arg       Leu
13381190  698       C         T         0

NOV52b SNP Data: 

NOV52b has eight SNP variants, whose variant positions for its nucleotide and
amino acid sequences are numbered according to SEQ ID NOs:229 and 230, respectively.
The nucleotide sequences of the NOV52b variants differ as shown in Table SNP31.

Table SNP31.

Variant   Nucleotides                   Amino Acids
          Position  Initial   Modified  Position  Initial   Modified
13381176  215       A         G         43        Glu       Glu
13376180  320       C         T         78        Tyr       Tyr
13376179  397       A         G         104       Gln       Arg
13381171  519       T         C         145       Ser       Pro
13381174  629       C         T         181       Ile       Ile
13381173  1173      C         A         363       Gln       Lys
13381172  1174      A         C         363       Gln       Pro
13381169  1402      A         G         0

NOV53c SNP Data:

NOV53c has two SNP variants, whose variant positions for its nucleotide and
amino acid sequences are numbered according to SEQ ID NOs:237 and 238, respectively.
The nucleotide sequences of the NOV53c variants differ as shown in Table SNP32.

Table SNP32.

Variant   Nucleotides                   Amino Acids
          Position  Initial   Modified  Position  Initial   Modified
13380578  424       C         T         136       Asp       Asp
13380577  869       A         G         285       Thr       Ala
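
The Initial and Modified columns of the SNP tables can be cross-checked against the standard genetic code. The following is a minimal sketch for that purpose and is not part of the original disclosure: the helper name `consistent_codons` is hypothetical, and because the tables do not list the codons themselves, the function only enumerates the codon pairs that would be compatible with a given row.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; '*' is a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def consistent_codons(nt_initial, nt_modified, aa_initial, aa_modified):
    """Return (codon_before, codon_after) pairs compatible with one table row.

    A pair is compatible when codon_before encodes aa_initial, contains
    nt_initial at some position, and replacing that single base with
    nt_modified yields a codon that encodes aa_modified.
    """
    before = THREE_TO_ONE[aa_initial]
    after = THREE_TO_ONE[aa_modified]
    hits = []
    for codon, aa in CODON_TABLE.items():
        if aa != before:
            continue
        for i, base in enumerate(codon):
            if base != nt_initial:
                continue
            mutated = codon[:i] + nt_modified + codon[i + 1:]
            if CODON_TABLE[mutated] == after:
                hits.append((codon, mutated))
    return hits


if __name__ == "__main__":
    # Rows of Table SNP32: 13380578 (C>T, Asp>Asp) and 13380577 (A>G, Thr>Ala).
    print(consistent_codons("C", "T", "Asp", "Asp"))  # [('GAC', 'GAT')]
    print(consistent_codons("A", "G", "Thr", "Ala"))
```

An empty result would suggest that a row's nucleotide and amino acid entries cannot both be right under the standard code, which is one way to flag cells damaged in transcription.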



NOV55a SNP Data: 

NOV55a has thirteen SNP variants, whose variant positions for its nucleotide and
amino acid sequences are numbered according to SEQ ID NOs:245 and 246, respectively.
The nucleotide sequences of the NOV55a variants differ as shown in Table SNP33.

Table SNP33.

Variant   Nucleotides                            Amino Acids
          Position     Initial      Modified     Position     Initial      Modified
13375283  272          C            T            0
13375284  281          T            C            0
13377920  1226         T            C            203          Ser          Pro
13377921  1447         C            T            276          Tyr          Tyr
13377922  1765         C            T            382          Gly          Gly
13377907  2021         A            G            468          Thr          Ala
13377908  2074         T            C            485          Tyr          Tyr
13375287  2153         G            C            [illegible]  [illegible]  Leu
13375288  2157         C            T            513          Pro          Leu
13375289  2160         C            T            514          Thr          Ile
13375290  2329         G            A            0
13377903  2417         A            G            0
13377904  2559         C            T            0

Example E: Potential Role(s) of CG96736-01 in Obesity and/or Diabetes 

The NOV55a gene (CG96736-01) is a Na+-dependent neutral amino acid
transporter that exhibits high-affinity electroneutral uptake of neutral amino acids such as
L-alanine, L-serine, L-threonine, L-cysteine and L-glutamine. This transporter prefers
neutral amino acids without bulky or branched side chains. It is localized to the plasma
membrane and has eight putative transmembrane segments. It appears to be a Type IIIa
membrane protein with an N-terminal cytoplasmic tail and a C-terminal extracellular
segment. In this respect, the expression pattern and its function in neutral amino acid uptake
are an indication of a role for NOV55a in obesity and/or diabetes.

Obesity and diabetes are major public health concerns in the developed and
developing world. It is estimated that over half of the adult US population is overweight,
with a body mass index (BMI) greater than the upper limit of normal (25), where the BMI is
defined as weight (kg) / [height (m)]^2. A common consequence of being overweight is
hyperlipidemia and the development of insulin resistance. This is followed by the
development of hyperglycemia, a hallmark of Type II diabetes. Left untreated, the
hyperglycemia leads to microvascular disease and end organ damage that includes
retinopathy, renal disease, cardiac disease, peripheral neuropathy and peripheral vascular
compromise. Currently, over 16 million adults in the US are affected and the condition has
now become rampant among school-age children as a consequence of the epidemic of
obesity in that age group.
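
The BMI definition above can be written out as a short calculation. This is an illustrative sketch only: the function names and the example weight and height are hypothetical, and the threshold of 25 is the upper limit of normal quoted in the text.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """BMI as defined in the text: weight (kg) divided by height (m) squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def is_overweight(bmi: float, upper_limit_of_normal: float = 25.0) -> bool:
    """True when the BMI exceeds the upper limit of normal cited in the text."""
    return bmi > upper_limit_of_normal


if __name__ == "__main__":
    # Hypothetical individual: 80 kg, 1.75 m.
    bmi = body_mass_index(80.0, 1.75)
    print(round(bmi, 1), is_overweight(bmi))
```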

Several cellular, animal and clinical studies were performed to elucidate the genetic
contribution to the etiology and pathogenesis of these conditions in a variety of
physiologic, pharmacologic or native states. These studies utilized the core technologies at
CuraGen Corporation to look at differential gene expression, protein-protein interactions,
large-scale sequencing of expressed genes and the association of genetic variations such as,
but not limited to, single nucleotide polymorphisms (SNPs) or splice variants in and
between biological samples from experimental and control groups. The goal of such
studies is to identify potential avenues for therapeutic intervention in order to prevent the
conditions, treat their consequences or cure the conditions.

In order to treat diseases, pathologies and other abnormal states or conditions in
which a mammalian organism has been diagnosed as being, or as being at risk for
becoming, other than in a normal state or condition, it is important to identify new
therapeutic agents. Such a procedure includes at least the steps of identifying a target
component within an affected tissue or organ, and identifying a candidate therapeutic agent
that modulates the functional attributes of the target. The target component may be any
biological macromolecule implicated in the disease or pathology. Commonly the target is a
polypeptide or protein with specific functional attributes. Other classes of target
macromolecule include a nucleic acid, a polysaccharide, and a lipid such as a complex lipid
or a glycolipid; in addition, a target may be a sub-cellular structure or extra-cellular structure
that is comprised of more than one of these classes of macromolecule. Once such a target has
been identified, it may be employed in a screening assay in order to identify favorable
candidate therapeutic agents from among a large population of substances or compounds.

In many cases the objective of such screening assays is to identify small molecule
candidates; this is commonly approached by the use of combinatorial methodologies to
develop the population of substances to be tested. The implementation of high throughput
screening methodologies is advantageous when working with large, combinatorial libraries
of compounds.

In an important aspect, the present invention provides a method of identifying a
candidate therapeutic agent for treating a disease, pathology, or an abnormal state or
condition using a target entity having a specific association with the disease. This method
includes:

(a) identifying a target biopolymer associated with the disease, pathology,
or abnormal state or condition;

(b) contacting the biopolymer with at least one chemical compound; and

(c) identifying a compound that binds to the biopolymer as a candidate
therapeutic agent.
In important embodiments of this method, the chemical compound is a member of a
combinatorial library of compounds; the contacting in step (b) is conducted on one or more
replicate samples of the biopolymer, and the replicate sample is contacted with at least one
member of the combinatorial library. In additional embodiments of this method, the
biopolymer is included within a cell and is functionally expressed therein. In still a further
advantageous embodiment, the binding of the compound modulates the function of the
biopolymer, and it is the modulation that provides the identification that the compound is a
potential therapeutic agent. In yet further significant embodiments of this method, the
target biopolymer is a polypeptide.

In a second aspect of the invention, a method for identifying a pharmaceutical agent
for treating a disease, pathology, or an abnormal state or condition is provided. The second
method includes the steps of:

(a) identifying a candidate therapeutic agent for treating said disease, pathology,
or abnormal state or condition by the method described in the preceding paragraph;

(b) contacting a biological sample associated with the disease, pathology, or
abnormal state or condition with the candidate therapeutic agent;

(c) determining whether the candidate induces an effect on the biological
sample associated with a therapeutic response therein; and

(d) identifying a candidate exerting such an effect as a pharmaceutical agent.

In significant embodiments of the second method, the biological sample includes a
cell, a tissue or organ, or is a nonhuman mammal.

A gene fragment of the mouse Neutral Amino Acid Transporter B was initially
found to be up-regulated 6-fold in the adipose tissue of obese mice (AKR) relative to
non-obese mice (C57BL/6J) using CuraGen's GeneCalling™ method of differential gene
expression. Two differentially expressed mouse gene fragments, migrating at
approximately 138 and 347 nucleotides in length (Tables MOU-3A and MOU-3B for
NOV55c (SEQ ID NO:438), and Tables MOU-3C and MOU-3D for NOV55d (SEQ ID
NO:439), respectively; vertical line), were definitively identified as components of the
Mouse Neutral Amino Acid Transporter B cDNA (in the graphs, the abscissa is measured
in lengths of nucleotides and the ordinate is measured as signal response). The method of
competitive PCR was used for confirmation of the gene assessment. The
electropherogramatic peaks corresponding to the gene fragment of the mouse Neutral
Amino Acid Transporter B are ablated when a gene-specific primer competes with primers
in the linker-adaptors during the PCR amplification. The peaks at 138 nt length are ablated
in the sample from both the obese and non-obese mice.

The direct sequences of the 138.4 and 346.7 nucleotide-long gene fragments and the
gene-specific primers used for competitive PCR are indicated on the cDNA sequence of the
Mouse Neutral Amino Acid Transporter B shown below; the fragments are in bold. The
gene-specific primers at the 5' and 3' ends of the fragment are in italics.

Competitive PCR Primer for the Mouse Neutral Amino Acid Transporter B (peak at 138.4).

Table MOU-1. NOV55c Gene Sequence (fragment from 564 to 700 in bold, band 
size: 137) (SEQ ID NO:438) 

83 CCAGAGAGGA CCAGAGTGCG AAAGCAGGTG GTTGCTGCGG TTCCCGTGAC CGGGTGCGCC 143
   GCTGCATTCG CGCCAACCTG CTGGTGCTGC TCACGGTGGC TGCGGTGGTG GCTGGCGTGG 203
   GGCTGGGGCT GGGGGTCTCG GCGGCGGGCG GTGCTGACGC GCTGGGTCCC GCGCGCTTGA 263
   CCGCTTTCGC CTTCCCGGGA GAGCTGCTGC TGCGTCTGCT GAAGATGATC ATCCTGCCGC 323
   TCGTGGTGTG CAGCCTGATC GGAGGTGCAG CCAGCTTGGA CCCTAGCGCG CTCGGTCGTG 383
   TGGGCGCCTG GGCGCTGCTC TTTTTCCTGG TCACCACACT GCTCGCGTCG GCGCTCGGCG 443
   TGGGTTTGGC CCTGGCGCTG AAGCCGGGCG CCGCCGTTAC CGCCATCACC TCCATCAACG 503
   ACTCTGTTGT AGACCCCTGT GCCCGCAGTG CACCAACCAA AGAGGTGCTG GATTCCTTTC 563
   TKGATCTCGT GAGGAATATT TIFOCCCTCCA ATCTGGTGTC TGCTGCCTTC CGCTCTTTTQ 623
   CTACCTCATA TGAACCCAAA GACAACTCAT GTAAAATACC GCAATCCTGT ATCCAGCGGG 683
   AGATCAATTC AACCATGGTC CAGCTTCTCT GTGAGGTGGA GGGAATGAAC ATCCTGGGCC 743
   TGGTGGTCTT CGCTATCGTC TTTGGTGTGG CTCTGCGGAA GCTGGGGCCC GAGGGTGAGC 803
   TGCTCATTCG TTTCTTCAAC TCCTTCAATG ATGCCACCAT GGTCCTGGTC TCCTGGATTA 863
   TGTGGTACGC ACCCGTTGGA ATCCTGTTCC TGGTGGCCAG CAAGATTGTG GAGATGAAAG 923
   ACGTCCGCCA GCTCTTCATC AGCCTCGGCA AATACATTCT GTGCTGCCTG CTGGGCCACG 983
   CCATCCACGG GCTCCTGGTT CTGCCTCTCA TCTACTTCCT CTTCACCCGC AAAAATCCCT 1043
   ATCGATTCCT GTGGGGCATC ATGACACCCC TGGCCACTGC TTTCGGGACC TCTTCTAGCT 1103
   CTGCCACCTT GCCTCTGATG ATGAAGTGTG TAGAGGAGAA GAATGGTGTG GCCAAACACA 1163
   tcagccggtt catcctac (gene length is 1668, only region from 83 to 1180 shown)

Competitive PCR Primer for the Mouse Neutral Amino Acid Transporter B (peak at 
346.7). The gene-specific primers at the 5' and 3' ends of the fragment are in italics. 

Table MOU-2. NOV55d Gene Sequence (fragment from 1 to 347 in italics, band 
size: 347) (SEQ ID NO:439) 

QGATCCCTGC CGCACCGACA CTGGATGCTG TGGCTGTGAC CCTGGGGAAG AGAAGAGCGG 61
AGATGGCAGA ATCATGGGGG CGGGGCCTCC TGCCACAGCC CCTGGCACTC ACAGGATGGT 121
GATGATCTTC ACGAAGTCCA GGGACACCCC GTTTAGTTGT GCGATCA[illegible] 181
ACACTGGAAC AGCGCCGCCC CGTCCATGTT GACCGTGGCG CCGATGGGTA GGATGAACCG 241
GCTGATGTGT TTGGCCACAC CATTCTTCTC CTCTACACAC TTCATCATCA GAGGCAAGGT 301
GGCAGAGCTA GAAGAGGTCC CGAAAGCAGT GGCCAGGGGT GTCATGA

(gene length is 347, only region from 1 to 347 shown)
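
The band sizes quoted in the captions of Tables MOU-1 and MOU-2 follow from the fragment coordinates when both end positions are counted. The sketch below illustrates that arithmetic; the function names are hypothetical, and it assumes 1-based, inclusive coordinates, which is what the stated values imply.

```python
def band_size(start: int, end: int) -> int:
    """Length of a fragment given 1-based, inclusive start and end positions."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


def extract_fragment(sequence: str, start: int, end: int) -> str:
    """Slice a fragment out of a sequence using 1-based, inclusive coordinates."""
    return sequence[start - 1:end]


if __name__ == "__main__":
    print(band_size(564, 700))  # NOV55c fragment, Table MOU-1: 137
    print(band_size(1, 347))    # NOV55d fragment, Table MOU-2: 347
    print(band_size(83, 1180))  # length of the NOV55c region shown
```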

Nucleic acid and amino acid sequences for NOV55a and NOV55b are disclosed in
Table 55a, SNPs for NOV55a and NOV55b are disclosed in Table SNP33, and quantitative
expression of these genes is shown in Tables AUA - AUK in Example D.

Tables MOU-3A and MOU-3B show the differentially expressed mouse neutral amino
acid transporter B gene fragment NOV55c, and Tables MOU-3C and MOU-3D show the
differentially expressed mouse neutral amino acid transporter B gene fragment NOV55d.

Tables MOU-3A and MOU-3B. Differentially Expressed Mouse Neutral Amino 
Acid Transporter B Gene Fragment, NOV55c. 
OTHER EMBODIMENTS

Although particular embodiments have been disclosed herein in detail, this has been
done by way of example for purposes of illustration only, and is not intended to be limiting
with respect to the scope of the appended claims, which follow. In particular, it is
contemplated by the inventors that various substitutions, alterations, and modifications may
be made to the invention without departing from the spirit and scope of the invention as
defined by the claims. The choice of nucleic acid starting material, clone of interest, or
library type is believed to be a matter of routine for a person of ordinary skill in the art with
knowledge of the embodiments described herein. Other aspects, advantages, and
modifications are considered to be within the scope of the following claims. The claims
presented are representative of the inventions disclosed herein. Other, unclaimed
inventions are also contemplated. Applicants reserve the right to pursue such inventions in
later claims.

CLAIMS

What is claimed is: 

1. An isolated polypeptide comprising the mature form of an amino acid
sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer
between 1 and 124.

2. An isolated polypeptide comprising an amino acid sequence selected from
the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 124.

3. An isolated polypeptide comprising an amino acid sequence which is at 
least 95% identical to an amino acid sequence selected from the group consisting of SEQ 
ID NO:2n, wherein n is an integer between 1 and 124. 

4. An isolated polypeptide, wherein the polypeptide comprises an amino acid 
sequence comprising one or more conservative substitutions in the amino acid sequence 
selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 
and 124. 

5. The polypeptide of claim 1 wherein said polypeptide is naturally occurring. 

6. A composition comprising the polypeptide of claim 1 and a carrier. 

7. A kit comprising, in one or more containers, the composition of claim 6. 

8. The use of a therapeutic in the manufacture of a medicament for treating a 
syndrome associated with a human disease, the disease selected from a pathology 
associated with the polypeptide of claim 1, wherein the therapeutic comprises the 
polypeptide of claim 1.

9. A method for determining the presence or amount of the polypeptide of 
claim 1 in a sample, the method comprising: 
(a) providing said sample;

(b) introducing said sample to an antibody that binds immunospecifically to the 
polypeptide; and 

(c) determining the presence or amount of antibody bound to said polypeptide, 
thereby determining the presence or amount of polypeptide in said sample. 

10. A method for determining the presence of or predisposition to a disease 
associated with altered levels of expression of the polypeptide of claim 1 in a first 
mammalian subject, the method comprising: 

a) measuring the level of expression of the polypeptide in a sample from the 
first mammalian subject; and 

b) comparing the expression of said polypeptide in the sample of step (a) to 
the expression of the polypeptide present in a control sample from a second 
mammalian subject known not to have, or not to be predisposed to, said 
disease, 

wherein an alteration in the level of expression of the polypeptide in the first subject as 
compared to the control sample indicates the presence of or predisposition to said disease. 

11. A method of identifying an agent that binds to the polypeptide of claim 1,
the method comprising: 

(a) introducing said polypeptide to said agent; and 

(b) determining whether said agent binds to said polypeptide. 

12. The method of claim 11 wherein the agent is a cellular receptor or a
downstream effector. 

13. A method for identifying a potential therapeutic agent for use in treatment 
of a pathology, wherein the pathology is related to aberrant expression or aberrant 
physiological interactions of the polypeptide of claim 1, the method comprising: 

(a) providing a cell expressing the polypeptide of claim 1 and having a 
property or function ascribable to the polypeptide; 

(b) contacting the cell with a composition comprising a candidate substance; 
and 
(c) determining whether the substance alters the property or function ascribable
to the polypeptide; 

whereby, if an alteration observed in the presence of the substance is not observed when 
the cell is contacted with a composition in the absence of the substance, the substance is 
identified as a potential therapeutic agent. 

14. A method for screening for a modulator of activity of or of latency or 
predisposition to a pathology associated with the polypeptide of claim 1, said method 
comprising: 

(a) administering a test compound to a test animal at increased risk for a 
pathology associated with the polypeptide of claim 1, wherein said test 
animal recombinantly expresses the polypeptide of claim 1; 

(b) measuring the activity of said polypeptide in said test animal after 
administering the compound of step (a); and 

(c) comparing the activity of said polypeptide in said test animal with the 
activity of said polypeptide in a control animal not administered said 
polypeptide, wherein a change in the activity of said polypeptide in said test 
animal relative to said control animal indicates the test compound is a 
modulator of activity of, or of latency or predisposition to, a pathology associated
with the polypeptide of claim 1. 

15. The method of claim 14, wherein said test animal is a recombinant test 
animal that expresses a test protein transgene or expresses said transgene under the control 
of a promoter at an increased level relative to a wild-type test animal, and wherein said 
promoter is not the native gene promoter of said transgene. 

16. A method for modulating the activity of the polypeptide of claim 1, the 
method comprising contacting a cell sample expressing the polypeptide of claim 1 with a 
compound that binds to said polypeptide in an amount sufficient to modulate the activity 
of the polypeptide. 



17. A method of treating or preventing a pathology associated with the 
polypeptide of claim 1, the method comprising administering the polypeptide of claim 1 to 
a subject in which such treatment or prevention is desired in an amount sufficient to treat
or prevent the pathology in the subject. 

18. The method of claim 17, wherein the subject is a human. 

19. A method of treating a pathological state in a mammal, the method 
comprising administering to the mammal a polypeptide in an amount that is sufficient to 
alleviate the pathological state, wherein the polypeptide is a polypeptide having an amino 
acid sequence at least 95% identical to a polypeptide comprising the amino acid sequence 
selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 
and 124 or a biologically active fragment thereof. 

20. An isolated nucleic acid molecule comprising a nucleic acid sequence 
selected from the group consisting of SEQ ID NO:2n-1, wherein n is an integer between 1
and 124. 

21. The nucleic acid molecule of claim 20, wherein the nucleic acid molecule is 
naturally occurring. 

22. A nucleic acid molecule, wherein the nucleic acid molecule differs by a 
single nucleotide from a nucleic acid sequence selected from the group consisting of SEQ 
ID NO:2n-1, wherein n is an integer between 1 and 124.

23. An isolated nucleic acid molecule encoding the mature form of a 
polypeptide having an amino acid sequence selected from the group consisting of SEQ ID
NO:2n, wherein n is an integer between 1 and 124. 

24. An isolated nucleic acid molecule comprising a nucleic acid selected from 
the group consisting of 2n-1, wherein n is an integer between 1 and 124.

25. The nucleic acid molecule of claim 20, wherein said nucleic acid molecule 
hybridizes under stringent conditions to the nucleotide sequence selected from the group 
consisting of SEQ ID NO:2n-1, wherein n is an integer between 1 and 124, or a
complement of said nucleotide sequence. 

26. A vector comprising the nucleic acid molecule of claim 20. 

27. The vector of claim 26, further comprising a promoter operably linked to 
said nucleic acid molecule. 

28. A cell comprising the vector of claim 26. 

29. An antibody that immunospecifically binds to the polypeptide of claim 1. 

30. The antibody of claim 29, wherein the antibody is a monoclonal antibody. 

31. The antibody of claim 29, wherein the antibody is a humanized antibody.

32. A method for determining the presence or amount of the nucleic acid 
molecule of claim 20 in a sample, the method comprising: 

(a) providing said sample; 

(b) introducing said sample to a probe that binds to said nucleic acid molecule; 
and 

(c) determining the presence or amount of said probe bound to said nucleic 
acid molecule, 

thereby determining the presence or amount of the nucleic acid molecule in said sample. 

33. The method of claim 32 wherein presence or amount of the nucleic acid 
molecule is used as a marker for cell or tissue type. 

34. The method of claim 33 wherein the cell or tissue type is cancerous. 

35. A method for determining the presence of or predisposition to a disease 
associated with altered levels of expression of the nucleic acid molecule of claim 20 in a 
first mammalian subject, the method comprising: 
a) measuring the level of expression of the nucleic acid in a sample from the
first mammalian subject; and 

b) comparing the level of expression of said nucleic acid in the sample of step 
(a) to the level of expression of the nucleic acid present in a control sample 
from a second mammalian subject known not to have or not be predisposed 
to, the disease; 

wherein an alteration in the level of expression of the nucleic acid in the first subject as 
compared to the control sample indicates the presence of or predisposition to the disease. 

36. A method of producing the polypeptide of claim 1, the method comprising
culturing a cell under conditions that lead to expression of the polypeptide, wherein said 
cell comprises a vector comprising an isolated nucleic acid molecule comprising a nucleic 
acid sequence selected from the group consisting of SEQ ID NO:2n-1, wherein n is an
integer between 1 and 124. 

37. The method of claim 36 wherein the cell is a bacterial cell. 

38. The method of claim 36 wherein the cell is an insect cell. 

39. The method of claim 36 wherein the cell is a yeast cell. 

40. The method of claim 36 wherein the cell is a mammalian cell. 

41. A method of producing the polypeptide of claim 2, the method comprising
culturing a cell under conditions that lead to expression of the polypeptide, wherein said
cell comprises a vector comprising an isolated nucleic acid molecule comprising a nucleic
acid sequence selected from the group consisting of SEQ ID NO:2n-1, wherein n is an
integer between 1 and 124. 

42. The method of claim 41 wherein the cell is a bacterial cell. 

43. The method of claim 41 wherein the cell is an insect cell. 






WO 03/029424 PCT/US02/31373 

44. The method of claim 41 wherein the cell is a yeast cell. 

45. The method of claim 41 wherein the cell is a mammalian cell. 